Re: RPM sqlite3 support

2016-01-14 Thread James Olin Oden
On Wed, Jan 13, 2016 at 12:10 PM, Jeff Johnson wrote:


>
> Most of the applications that use an rpmdb appear to be a 1-time data
> scrape of an rpmdb (libguestfs with db_dump(1) and what has been done in
> yum/zypper in the past). A 1-time data scrape of an entire rpmdb perhaps
> indicates that no database whatsoever in rpm is what developers wish:
> there is no reason I know that a pile of *.rpm files cannot replace the
> /var/lib/rpm/Packages store and leave indexing to applications (which
> are doing the indexing into the data scrape already).
>
> *shrug*
>
Exactly. BTW, this sort of storage is not too far from the way NoSQL-type
databases like MongoDB store things. You essentially store documents
that are quickly retrievable, and then the indexing on the document's data
is done in the application (and sometimes with another database).

Either way, though, it would be a shame for RPM to lose its ACID properties.
That said, the world is moving to doing rollbacks through a filesystem
snapshot (which I just noted was in RHEL 7 today), so maybe ACID becomes far
less important.

...James


Re: RPM sqlite3 support

2016-01-13 Thread Jeff Johnson

On Jan 13, 2016, at 6:34 AM, devzero2000 wrote:

> 
> On Tue, Jan 12, 2016 at 10:06 PM, Mark Hatle <mark.ha...@windriver.com> wrote:
> On 1/12/16 2:06 PM, Tim Mooney wrote:
> > In regard to: Re: RPM sqlite3 support, Jeffrey Johnson said (at 12:46pm on...:
> >
> >> The sqlite3 code (and support) in rpm5 was abandoned in favor of
> >> Berkeley DB ACID transactional support quite some years ago
> >
> > I've been meaning to ask about this for a while, and this provides a
> > good segue...
> >
> > With Oracle's license change on BDB 6.x (or 12.x, or whatever they're
> > calling it) to AGPL, does that impact rpm5's long-term use of BDB?
> 
> I know we and many of our commercial customers have rejected BDB 6.x because of
> the change, so we've been forced to 'support' BDB 5.x.
> 
> Personally I would like to get rid of anything that has to do with BerkeleyDB,
> just to get rid of this or future license questions from customers.  (But
> unfortunately I don't know what can reasonably replace BDB.)
> 
> Hello 
> 
> FWIW, @rpm.org started a plan to replace the rpmdb format
> 
> https://fedoraproject.org/wiki/Changes/NewRpmDBFormat
> 
> Here is the original announcement
> 
> http://comments.gmane.org/gmane.linux.redhat.fedora.devel/215427
> 
> Currently no details are known (to me)

There aren't many details, certainly the reasons for change are not clearly
expressed in the description:

The current implementation of the RPM Database is based on Berkeley DB.
There are doubts about its future and level of maintenance. In addition
rpm's use of the database has multiple issues on its own. As a result RPM
upstream is working to replace the database format with a new
implementation.

The implementation (assuming rpm/lib/backend/ndb is the new format) is a rather
straightforward hash based database afaict. I am not seeing the necessary
interconversion tools between the formats, though those are not hard to write
when the time comes. I'm also not seeing provisions for transactions and ACID
and caching and all the other sophistication/complexity that Berkeley DB
provides, but perhaps I am not appreciating the ndb implementation from a
single pass through the code.

There are also no obvious signs of remote access and/or replication. The API is
simple enough that any form of RPC will serve for remoting. OTOH, there's
much much more to replication than RPC.

The discussions are mostly legitimate concerns about compatibility, and it's
pretty clear that the proposed solution for other projects is

    Use rpmlib APIs if you wish compatibility with an rpmdb.

Which isn't bad advice, just opposite from the entire intent of using a
"standard" API in Berkeley DB way back when so that other/better
implementations than rpm could be developed (of course the main problem there
is that the Packages database contains a header blob rather than the
simple/rich data types that applications/developers expect).

Most of the applications that use an rpmdb appear to be a 1-time data scrape
of an rpmdb (libguestfs with db_dump(1) and what has been done in yum/zypper
in the past). A 1-time data scrape of an entire rpmdb perhaps indicates that
no database whatsoever in rpm is what developers wish: there is no reason I
know that a pile of *.rpm files cannot replace the /var/lib/rpm/Packages
store and leave indexing to applications (which are doing the indexing
into the data scrape already).
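
To make the "pile of *.rpm files plus application-side indexing" idea
concrete, here is a minimal sketch in Python. It is an illustration only, not
anything rpm or yum/zypper actually does: it assumes files follow the usual
name-version-release.arch.rpm naming convention and indexes on the file name
alone, whereas a real application would read the package header. The function
names parse_rpm_filename and build_name_index are made up for this example.

```python
import os
import tempfile
from collections import defaultdict


def parse_rpm_filename(filename):
    """Split 'name-version-release.arch.rpm' into its four parts.

    Assumption: the file name follows the common NVRA convention.
    The package header, not the file name, is authoritative.
    """
    if not filename.endswith(".rpm"):
        raise ValueError("not an rpm file name: " + filename)
    stem = filename[:-len(".rpm")]
    nvr, arch = stem.rsplit(".", 1)              # arch is the last dotted field
    name, version, release = nvr.rsplit("-", 2)  # name itself may contain '-'
    return {"name": name, "version": version, "release": release, "arch": arch}


def build_name_index(directory):
    """Scan a directory once and map package name -> sorted list of file names."""
    index = defaultdict(list)
    for entry in sorted(os.listdir(directory)):
        if entry.endswith(".rpm"):
            index[parse_rpm_filename(entry)["name"]].append(entry)
    return dict(index)


if __name__ == "__main__":
    # Demo with empty placeholder files; no real packages are needed.
    with tempfile.TemporaryDirectory() as pile:
        for fn in ("bash-4.2.46-19.el7.x86_64.rpm", "zsh-5.0.2-14.el7.x86_64.rpm"):
            open(os.path.join(pile, fn), "w").close()
        print(build_name_index(pile))
```

The index lives entirely in the application, which is the point: the store is
just files, and each consumer builds whatever lookup it needs after one scan.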

*shrug*

C+ would be my grade for what I am seeing implemented in rpm/lib/backend/ndb

hth

73 de Jeff
> 
> Elia 
>  
> 
> --Mark
> 
> > Tim
> >
> 
> __
> RPM Package Manager http://rpm5.org
> User Communication List rpm-users@rpm5.org
> 



Re: RPM sqlite3 support

2016-01-12 Thread Jeffrey Johnson

> On Jan 12, 2016, at 6:39 PM, Tim Mooney <tim.moo...@ndsu.edu> wrote:
> 
> In regard to: Re: RPM sqlite3 support, Jeffrey Johnson said (at 4:40pm on...:
> 
>>> With Oracle's license change on BDB 6.x (or 12.x, or whatever they're
>>> calling it) to AGPL, does that impact rpm5's long-term use of BDB?
>>> 
>> 
>> No impact for the project, but there’s always users who want/need different.
>> 
>> BDB has been doing the job for RPM (and many many other projects) since forever.
>> There’s little engineering reason to change.
> 
> Agreed, but then outside factors override a lot of technology design
> decisions in software design, for better or worse.
> 
> My main reason for asking relates to the fact that a lot of other projects
> have abandoned BDB, again likely not for engineering reasons.  I can
> envision a day when the "cool kids" look at software that relies on BDB
> and, not understanding the history, decide to write their own "better"
> alternative that uses the cool database of the current time.  If I hear
> anyone say MongoDB, I will almost certainly commit some type of crime.
> 

(aside)
You do realize that RPM5 has had the mongo-c-driver embedded inside, batteries
included, for the past 5 years?

Seriously: most rpm users have nearly identical configurations, and the
days where each and every client PC needs its own speshul copy
of changelogs and descriptions and identical blobs of header metadata
are clearly numbered.

Note that rpm5 has also embedded sqlite (i.e., so that sql updates can be
distributed through %post, and so that multiple copies of installed software
metadata in schema-du-jour can be used as one wishes).

*shrug* My job is to make implementations exist, not honk my warez. 

> The other concern is obviously the additional maintenance burden of having
> to also maintain the 5.x BDB codebase.  That base is pretty mature, so
> there shouldn't be a lot of need for fixes or security patches, but
> maintaining that will eventually become unpalatable.
> 

BDB has done an excellent job maintaining a consistent backward compatible API:
there is nothing whatsoever wrong with BDB 5.x as used by RPM.

Bundle BDB into RPM if you don’t want the added package monkey task maintaining
older versions of BDB as a package, license and batteries included.

73 de Jeff


> Tim
> -- 
> Tim Mooney tim.moo...@ndsu.edu
> Enterprise Computing & Infrastructure  701-231-1076 (Voice)
> Room 242-J6, Quentin Burdick Building  701-231-8541 (Fax)
> North Dakota State University, Fargo, ND 58105-5164



Re: RPM sqlite3 support

2016-01-12 Thread Jeffrey Johnson

> On Jan 12, 2016, at 11:25 AM, Jate Sujjavanich wrote:
>
> I am using rpm 5.4.9 as my package manager on an embedded system, and I am
> running into some bottlenecks with the Berkeley database. I am using an SD
> card as my root file system.
>
> The rpm transactions from a kernel upgrade take an hour or so. Using the
> --stats option, I found that rpm was spending 15 or so seconds total per
> package in dbadd, dbget, etc. An strace revealed that many fdatasync calls
> were happening. It appears this is due to ACID-ity of transactions.
> 

Yes: fsync is especially costly on an SD card, and is measurably slower
on other media as well. ACID doesn’t come for free.

A performance fix is usually to stub out sync (and all its variants) in the
vectors used by Berkeley DB to make OS calls when opening the
dbenv.
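
To see how much the per-record sync costs on a given medium before touching
anything in Berkeley DB, a rough measurement like the following may help. This
is a sketch in Python, not the Berkeley DB hook itself: timed_writes and
compare are hypothetical helper names, and the record count and size are
arbitrary. By default it writes into the system temporary directory, which may
be a RAM-backed filesystem where fsync is nearly free, so pass a directory on
the SD card to get a meaningful number.

```python
import os
import tempfile
import time


def timed_writes(path, records, sync_each):
    """Write records to path, calling fsync after each record or once at the end.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for record in records:
            os.write(fd, record)
            if sync_each:
                os.fsync(fd)      # durable after every record (the costly case)
        if not sync_each:
            os.fsync(fd)          # one sync for the whole batch
    finally:
        os.close(fd)
    return time.perf_counter() - start


def compare(count=200, size=512, directory=None):
    """Return (seconds with per-record fsync, seconds with a single fsync)."""
    records = [bytes([i % 256]) * size for i in range(count)]
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        synced = timed_writes(os.path.join(tmp, "synced.bin"), records, True)
        batched = timed_writes(os.path.join(tmp, "batched.bin"), records, False)
    return synced, batched


if __name__ == "__main__":
    synced, batched = compare()
    print("fsync per record: %.4fs, single fsync: %.4fs" % (synced, batched))
```

The trade-off is the one described above: fewer syncs means faster
transactions, but less of the data is guaranteed to be on the medium if power
is lost mid-write.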

> I found some posts saying that sqlite could be used as the database backend, 
> so I tried this. I tried using configure to activate sqlite without success.
> 

Those posts were from years ago: transactional support was added in ~2010.

> I can compile with sqlite support, but so far, I can't get RPM to create any 
> sqlite files. I've tried database conversion, initializing an empty RPM 
> database, and setting _dbapi to 4 in /usr/lib/rpm/macros.
> 

There are significant changes for transactional support that have not been
implemented in the rpm sqlite3 code.

> I discovered that within configure.ac, DBAPI is hard-coded to 3
> (Berkeley DB).
> 

Yes.

> I would like to know if sqlite3 as a backend is supported by RPM these days 
> in 5.4.9 or any other versions. I want to find out if some of the patches 
> within yocto/poky are breaking sqlite support.
> 

The sqlite3 code (and support) in rpm5 was abandoned in favor of
Berkeley DB ACID transactional support quite some years ago.

I don’t know what patches are in poky/yocto specifically, but it should not
be too hard to find patches related to database functionality.

Try filing a bug report with poky/yocto, which will likely trigger a support 
request to me.

The performance on an SD card rather than the choice of backend database is
the issue in need of fixing.

hth

73 de Jeff


> Jate
> 



Re: RPM sqlite3 support

2016-01-12 Thread Jeffrey Johnson

> On Jan 12, 2016, at 3:06 PM, Tim Mooney <tim.moo...@ndsu.edu> wrote:
> 
> In regard to: Re: RPM sqlite3 support, Jeffrey Johnson said (at 12:46pm on...:
> 
>> The sqlite3 code (and support) in rpm5 was abandoned in favor of
>> Berkeley DB ACID transactional support quite some years ago
> 
> I've been meaning to ask about this for a while, and this provides a
> good segue...
> 
> With Oracle's license change on BDB 6.x (or 12.x, or whatever they're
> calling it) to AGPL, does that impact rpm5's long-term use of BDB?
> 

No impact for the project, but there’s always users who want/need different.

BDB has been doing the job for RPM (and many many other projects) since forever.
There’s little engineering reason to change.

(aside since it's sure to be mentioned)
MDB could be adapted to RPM. The usage cases for OpenLDAP and RPM
are rather different: OpenLDAP is a “lightweight” server with a well defined
schema, while rpm saves a header blob under an index with secondary lookup.
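
For readers unfamiliar with that layout, here is a small Python sketch of "a
blob under an index with secondary lookup". It is a toy model, not rpm's or
Berkeley DB's code: the class name BlobStore, the big-endian key packing, and
the use of plain dictionaries are all assumptions made for illustration.

```python
import struct
from collections import defaultdict


class BlobStore:
    """Primary store of opaque blobs plus one secondary index by name."""

    def __init__(self):
        self._primary = {}                 # packed uint32 key -> header blob
        self._by_name = defaultdict(set)   # name -> set of instance numbers
        self._next_instance = 1

    @staticmethod
    def _key(instance):
        # 4-byte unsigned key; the byte order here is an arbitrary choice.
        return struct.pack(">I", instance)

    def add(self, name, blob):
        """Store a blob and record it in the secondary index; return its instance."""
        instance = self._next_instance
        self._next_instance += 1
        self._primary[self._key(instance)] = blob
        self._by_name[name].add(instance)
        return instance

    def get(self, instance):
        """Primary lookup: instance number -> blob."""
        return self._primary[self._key(instance)]

    def find(self, name):
        """Secondary lookup: name -> instance numbers -> blobs."""
        return [self.get(i) for i in sorted(self._by_name.get(name, ()))]

    def remove(self, name, instance):
        """Delete a blob and keep the secondary index consistent."""
        del self._primary[self._key(instance)]
        self._by_name[name].discard(instance)
        if not self._by_name[name]:
            del self._by_name[name]


if __name__ == "__main__":
    store = BlobStore()
    store.add("bash", b"header-bytes-for-bash")
    print(store.find("bash"))
```

The store never looks inside the blob; everything queryable has to be copied
into a secondary index when the blob is written.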

The sqlite code wasn’t very difficult and could be resurrected. The problem
is/was one of design: the sqlite3 code mimicked the BDB vectors with a
primitive schema. What users are expecting with SQL is a much richer schema.
There was a very clever implementation done at carb.org years ago that built
schema du jour indices by internalizing rpm --query in postgres and rebuilding
tables as needed. The usage case there was web queries, which are a very
different application than doing install/remove (which are essentially just
bulk upgrades).
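
The difference between a primitive key/value schema and a richer one can be
sketched with Python's built-in sqlite3 module. This is an illustration only
and does not reproduce the old rpm5 sqlite3 backend: the table names, the
columns, and the NUL-separated blob encoding are all invented for the example.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database holding both layouts side by side."""
    db = sqlite3.connect(":memory:")
    # Key/value layout: mimics a BDB-style store; the blob is opaque to SQL.
    db.execute("CREATE TABLE packages_kv (key INTEGER PRIMARY KEY, value BLOB)")
    # Richer layout: fields are real columns that SQL can filter and index.
    db.execute(
        "CREATE TABLE packages (id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "version TEXT NOT NULL, arch TEXT NOT NULL)"
    )
    db.execute("CREATE INDEX packages_name ON packages (name)")
    return db


def add_package(db, name, version, arch):
    """Insert the same package into both layouts; return the shared key."""
    blob = "\0".join((name, version, arch)).encode("utf-8")
    cur = db.execute("INSERT INTO packages_kv (value) VALUES (?)", (blob,))
    db.execute(
        "INSERT INTO packages (id, name, version, arch) VALUES (?, ?, ?, ?)",
        (cur.lastrowid, name, version, arch),
    )
    return cur.lastrowid


def find_by_name_kv(db, name):
    """Primitive schema: every blob is fetched and decoded by the caller."""
    hits = []
    for key, value in db.execute("SELECT key, value FROM packages_kv ORDER BY key"):
        if value.decode("utf-8").split("\0")[0] == name:
            hits.append(key)
    return hits


def find_by_name_sql(db, name):
    """Richer schema: the database does the filtering."""
    rows = db.execute("SELECT id FROM packages WHERE name = ? ORDER BY id", (name,))
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = open_demo_db()
    add_package(db, "bash", "4.2.46", "x86_64")
    print(find_by_name_kv(db, "bash"), find_by_name_sql(db, "bash"))
```

Both lookups return the same keys, but only the second one lets a user write
ad hoc SQL against package fields, which is what people expect when they hear
"SQL backend".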

From an engineering POV, remapping the Packages store to a heap, possibly
remote, or possibly into an offset in the original package, might be useful if
secondary lookup is continued.

The next step to alternative Package stores would be to switch the package
index from uint32_t to 128 bits and use a UUID instead.
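
As a quick illustration of what that key change means in bytes, here is a
short Python sketch. The helper names uint32_key and uuid_key and the
big-endian packing are assumptions for the example, not rpm's on-disk format.

```python
import struct
import uuid


def uint32_key(instance):
    """Pack a package instance number as a 4-byte key (uint32_t sized)."""
    return struct.pack(">I", instance)


def uuid_key(value=None):
    """Return a 16-byte (128-bit) key; a random UUID is used if none is given."""
    if value is None:
        value = uuid.uuid4()
    return value.bytes


if __name__ == "__main__":
    print(len(uint32_key(42)), "bytes for a uint32 key")
    print(len(uuid_key()), "bytes for a UUID key")
```

A uint32 key is only unique within one database that hands out the numbers,
while a UUID can be generated independently by any store, local or remote,
which is what makes it attractive for alternative Package stores.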

hah

73 de Jeff


> Tim
> -- 
> Tim Mooney tim.moo...@ndsu.edu
> Enterprise Computing & Infrastructure  701-231-1076 (Voice)
> Room 242-J6, Quentin Burdick Building  701-231-8541 (Fax)
> North Dakota State University, Fargo, ND 58105-5164


Re: RPM sqlite3 support

2016-01-12 Thread Jate Sujjavanich
Performance was my whole rationale for trying sqlite3.

I do have some power backup, so I will try reducing the fsyncs.


On Tue, Jan 12, 2016 at 12:03 PM, Mark Hatle wrote:

>
> Note, sqlite support was -much- slower than BerkeleyDB.  It was only added
> to deal with people who didn't want to have the Sleepcat license
> conditions. Performance was definitely not the reason.
>
> If your device can live w/o the syncs, you can disable them -- mostly in
> the RPM configuration... just keep in mind if someone powers down in the
> middle of a transaction the database is more likely to be messed up than
> with the fsyncs.
>
> --Mark
>
> > Jate
> >
>


Re: RPM sqlite3 support

2016-01-12 Thread Mark Hatle
On 1/12/16 10:25 AM, Jate Sujjavanich wrote:
> I am using rpm 5.4.9 as my package manager on an embedded system, and I am
> running into some bottlenecks with the Berkeley database. I am using an SD
> card as my root file system.
>
> The rpm transactions from a kernel upgrade take an hour or so. Using the
> --stats option, I found that rpm was spending 15 or so seconds total per
> package in dbadd, dbget, etc. An strace revealed that many fdatasync calls
> were happening. It appears this is due to ACID-ity of transactions.
>
> I found some posts saying that sqlite could be used as the database backend,
> so I tried this. I tried using configure to activate sqlite without success.
>
> I can compile with sqlite support, but so far, I can't get RPM to create any
> sqlite files. I've tried database conversion, initializing an empty RPM
> database, and setting _dbapi to 4 in /usr/lib/rpm/macros.
>
> I discovered that within configure.ac, DBAPI is hard-coded to 3
> (Berkeley DB).
> 
> I would like to know if sqlite3 as a backend is supported by RPM these days in
> 5.4.9 or any other versions. I want to find out if some of the patches within
> yocto/poky are breaking sqlite support.

sqlite3 was semi-maintained in older versions, but by 5.4.9 it had pretty
much bit rotted to the point of not being functional.

Note, sqlite support was -much- slower than BerkeleyDB.  It was only added to
deal with people who didn't want to have the Sleepcat license conditions.
Performance was definitely not the reason.

If your device can live w/o the syncs, you can disable them -- mostly in the RPM
configuration... just keep in mind if someone powers down in the middle of a
transaction the database is more likely to be messed up than with the fsyncs.

--Mark

> Jate
> 



Re: RPM sqlite3 support

2016-01-12 Thread Tim Mooney

In regard to: Re: RPM sqlite3 support, Jeffrey Johnson said (at 12:46pm on...:


> The sqlite3 code (and support) in rpm5 was abandoned in favor of
> Berkeley DB ACID transactional support quite some years ago


I've been meaning to ask about this for a while, and this provides a
good segue...

With Oracle's license change on BDB 6.x (or 12.x, or whatever they're
calling it) to AGPL, does that impact rpm5's long-term use of BDB?

Tim
--
Tim Mooney tim.moo...@ndsu.edu
Enterprise Computing & Infrastructure  701-231-1076 (Voice)
Room 242-J6, Quentin Burdick Building  701-231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164


Re: RPM sqlite3 support

2016-01-12 Thread Mark Hatle
On 1/12/16 2:06 PM, Tim Mooney wrote:
> In regard to: Re: RPM sqlite3 support, Jeffrey Johnson said (at 12:46pm on...:
> 
>> The sqlite3 code (and support) in rpm5 was abandoned in favor of
>> Berkeley DB ACID transactional support quite some years ago
> 
> I've been meaning to ask about this for a while, and this provides a
> good segue...
> 
> With Oracle's license change on BDB 6.x (or 12.x, or whatever they're
> calling it) to AGPL, does that impact rpm5's long-term use of BDB?

I know we and many of our commercial customers have rejected BDB 6.x because of
the change, so we've been forced to 'support' BDB 5.x.

Personally I would like to get rid of anything that has to do with BerkeleyDB,
just to get rid of this or future license questions from customers.  (But
unfortunately I don't know what can reasonably replace BDB.)

--Mark

> Tim
> 



Re: RPM sqlite3 support

2016-01-12 Thread Tim Mooney

In regard to: Re: RPM sqlite3 support, Jeffrey Johnson said (at 4:40pm on...:


>> With Oracle's license change on BDB 6.x (or 12.x, or whatever they're
>> calling it) to AGPL, does that impact rpm5's long-term use of BDB?
>
> No impact for the project, but there’s always users who want/need different.
>
> BDB has been doing the job for RPM (and many many other projects) since forever.
> There’s little engineering reason to change.


Agreed, but then outside factors override a lot of technology design
decisions in software design, for better or worse.

My main reason for asking relates to the fact that a lot of other projects
have abandoned BDB, again likely not for engineering reasons.  I can
envision a day when the "cool kids" look at software that relies on BDB
and, not understanding the history, decide to write their own "better"
alternative that uses the cool database of the current time.  If I hear
anyone say MongoDB, I will almost certainly commit some type of crime.

The other concern is obviously the additional maintenance burden of having
to also maintain the 5.x BDB codebase.  That base is pretty mature, so
there shouldn't be a lot of need for fixes or security patches, but
maintaining that will eventually become unpalatable.

Tim
--
Tim Mooney tim.moo...@ndsu.edu
Enterprise Computing & Infrastructure  701-231-1076 (Voice)
Room 242-J6, Quentin Burdick Building  701-231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164