Re: sync suggestions [was Re: [gentoo-portage-dev] Few things, which imho would make portage better]

2006-03-15 Thread tvali
But what if user-side changes are kept separate?

I mean, what if there is a boolean field "user", which is set for
user-changed tables?

Or, to keep it simpler, I use two tables in my example.

Let's assume that a user wants to change the description of
dev-lang/php. The user then changes dev-lang/php in the table
user-tree, but leaves the entry in the table portage-tree unchanged.

Of course, this example would make queries against the portage db more
complex than they should be, so a more optimized version should be
found -- but, anyway, there are ways to make things work.


Imagine that (I take only ebuild files into consideration here):
* The portage tree is kept in an SQL database, which contains the following fields:
** Id
** LongName (dev-db/mysql-4.1.14)
** Name (dev-db/mysql)
** ShortName (mysql)
** Slot
** Server -- which server or server group contains that ebuild (if the
same ebuild is on several servers, it should be repeated in SQL). If
empty, then this ebuild was created by the user
** ServerStatus -- false if deleted from the server (only used if UserInfo
is not NULL)
** Status -- if false, this row will not be used
** Current -- true if this is the ebuild that would be installed when
"emerge mysql" is run on this system
** Description
** /.../ -- other fields parsed from the ebuild
** ServerInfo -- the ebuild file from the server
** UserInfo -- user additions to that file

* Dependency tree
** Will be updated from the previous table

Now, updates from the server should have the following form:
* Id
* ServerInfo
* Action -- add/delete/update

All other fields will be parsed out on the user's computer when the
Action is applied.

Any changes to the portage tree will then be made via portage commands,
not directly in SQL.
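
The update scheme described above could be sketched roughly as follows. This is only an illustration under assumptions, not code from portage: the table name "ebuilds", the column types, and the helpers create_tree, parse_description and apply_update are all hypothetical, and only a few of the listed fields are modelled. Rows that carry user additions are flagged through ServerStatus instead of being deleted.

```python
import sqlite3


def create_tree(conn):
    # Column names follow the field list above; the types are assumptions.
    conn.execute(
        """
        CREATE TABLE ebuilds (
            Id           INTEGER PRIMARY KEY,
            Description  TEXT,
            ServerStatus INTEGER DEFAULT 1,
            ServerInfo   TEXT,
            UserInfo     TEXT
        )
        """
    )


def parse_description(server_info):
    # Hypothetical stand-in for real ebuild parsing: it only looks for a
    # DESCRIPTION="..." line in the ebuild text.
    for line in server_info.splitlines():
        if line.startswith("DESCRIPTION="):
            return line.split("=", 1)[1].strip('"')
    return None


def apply_update(conn, ebuild_id, server_info, action):
    """Apply one (Id, ServerInfo, Action) record sent by the server."""
    if action == "add":
        conn.execute(
            "INSERT INTO ebuilds (Id, ServerInfo, Description) VALUES (?, ?, ?)",
            (ebuild_id, server_info, parse_description(server_info)),
        )
    elif action == "update":
        conn.execute(
            "UPDATE ebuilds SET ServerInfo = ?, Description = ? WHERE Id = ?",
            (server_info, parse_description(server_info), ebuild_id),
        )
    elif action == "delete":
        # Rows with user additions are kept and marked as gone upstream.
        conn.execute(
            "UPDATE ebuilds SET ServerStatus = 0 "
            "WHERE Id = ? AND UserInfo IS NOT NULL",
            (ebuild_id,),
        )
        # Rows without user additions are simply removed.
        conn.execute(
            "DELETE FROM ebuilds WHERE Id = ? AND UserInfo IS NULL",
            (ebuild_id,),
        )
    else:
        raise ValueError("unknown action: %r" % (action,))
```

The sketch sidesteps the harder question raised in the reply below, namely what happens when the local database no longer matches upstream.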

2006/3/15, Brian Harring [EMAIL PROTECTED]:
 On Tue, Mar 14, 2006 at 03:50:18PM +0200, tvali wrote:
  Another question now is about sync.
 
  I read somewhere that it is not good user behavior to sync more
  than once per day. I understand this to mean that a sync is a huge
  download even if nothing has changed.
 
  Isn't it a nice idea to have this database just optimized?
 
  I mean (assuming portage using SQL now) -- that would be really simple
  to log every change in portage tree as series of SQL queries, which
  would reproduce this change.

 Pushing the delta (what you're suggesting) is only usable if it can be
 guaranteed the user hasn't modified their tree at all (thus resulting
 in the cache db differing from upstream's).

 That right there is the brass tacks of it; you wouldn't be able to
 push just the changes, you would have to regenerate the _whole_ db
 (slow, 20k inserts assuming only one table).

 Sidenote... please post separate threads for separate
 ideas/discussions, else it's damn hard to look back and pull the
 specific thread where something was discussed.
 ~harring






--
tvali
(e-mail: [EMAIL PROTECTED]; msn: [EMAIL PROTECTED];
icq: 317-492-912)

On the website of an Estonian internet company I came across this quote:
If you don't do it excellently, don't do it at all. Because if it's not
excellent, it won't be profitable or fun, and if you're not in
business for fun or profit, what the hell are you doing here?
Robert Townsend

-- 
gentoo-portage-dev@gentoo.org mailing list



Re: esearch integration [was Re: [gentoo-portage-dev] Few things, which imho would make portage better]

2006-03-15 Thread tvali
Yes SQL tables are better for that as it's simpler to update them :)

2006/3/15, Brian Harring [EMAIL PROTECTED]:
 On Tue, Mar 14, 2006 at 04:33:06PM +0200, tvali wrote:
  I did think about it now and it seems to me that probably it would be
  much faster if esearch is not just another package, but part of
  portage.
 
  I mean -- functions of portage, which query db, should use esearch
  index wherever they need information, which exists in that index.
 
  As much as i can understand, /var/cache/edb/ contains esearch database
  in many files and esearchdb.py is search index as python script.

 No...
 esearch is a static db- only useful for 'frozen' trees, eg rsync
 distributed trees with no eclasses in overlays.  All cvs users (devs)
 run unfrozen trees (readonly/readwrite is better terminology), thus
 portage updates the cache db on the fly as needed.

 If esearch was integrated into portage the result would be stale
 metadata for cvs users, and stale metadata for rsync users when
 overlays with eclasses are involved- no go.

 That and esearch last I looked just generates a giant dict (thus the
 cache is in memory), which kind of blows the 25mb mem usage 2.1
 now sports :)

 ~harring








Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread tvali
2006/3/14, Simon Stelling [EMAIL PROTECTED]:
 tvali wrote:
  * Add package to world checking all dependencies, but not emerging
  * Remove package from world without unmerging

 Uhm, why would you want that?

On my computer, building takes several hours for some packages. I have
worked around it by building in several windows at the same time,
but I guess that's not the best practice at all, especially with
some combinations of packages. So I would like to add all packages,
see all blocks and interdependencies, then start building when
ready (and leave home while it builds, being sure that it will do
its best while I'm away).

  * Save all important messages into file instead of beeping (I may be away
  from home); optionally replay them at end of update

 elog can do this.

Will check out.

  * When updating or installing new package, if something goes wrong, still
  emerge other packages, which haven't failed packages as dependencies -- in my
  case, for example, when I do --update, for example, then the fact that
  update of first package fails does not definitely mean that I don't want to
  update others

 --resume

I know --resume, but that's not what I meant.

Imagine the situation:
emerge -p package1 package2

part1 (needed by package1)
part2 (needed by part3)
part3 (needed by package2)
part4 (needed by package2)
part5 (needed by both package1 and package2)
package1
package2

Now I start emerge package1 package2 and go to work (or sleep or party or sauna).

part2 fails -- some bug or misconfiguration.

Currently this means that when I come back home, I see that part1 is
built, but nothing more, and 4-5 hours are actually just wasted.

There should be a way to have all of these built when I come home:

part1 (needed by package1)
part4 (needed by package2)
part5 (needed by both package1 and package2)
package1

There should be an error message listing all the others. Also, I would
add them to world and then emerge, so that --update would try to emerge
them again, asking me about all previously failed packages when I start
it, whether I want to retry this time (yes, no, all, none).
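
The behaviour requested in the scenario above can be illustrated with a small sketch. It is not how emerge works internally; the function name buildable_after_failure and the dictionary layout are assumptions made for this example. Given the direct dependencies of each package and the set of failed builds, it separates the packages that can still be built from those that must be skipped.

```python
def buildable_after_failure(deps, failed):
    """Split packages into those still buildable and those to skip.

    deps maps each package to the list of packages it directly needs.
    failed is the set of packages whose build failed.
    """
    skipped = set(failed)
    changed = True
    # Keep propagating until no further package is affected: anything that
    # needs a skipped package, directly or indirectly, is skipped as well.
    while changed:
        changed = False
        for pkg, needs in deps.items():
            if pkg not in skipped and skipped.intersection(needs):
                skipped.add(pkg)
                changed = True
    buildable = [pkg for pkg in deps if pkg not in skipped]
    return buildable, sorted(skipped)


# The dependency layout from the example in this message.
deps = {
    "part1": [],
    "part2": [],
    "part3": ["part2"],
    "part4": [],
    "part5": [],
    "package1": ["part1", "part5"],
    "package2": ["part3", "part4", "part5"],
}
buildable, skipped = buildable_after_failure(deps, {"part2"})
```

With part2 failing, buildable holds part1, part4, part5 and package1, while skipped holds package2, part2 and part3, which would form the error message listing "all the others".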
 --
 Kind Regards,
 Simon Stelling
 Gentoo/AMD64 Developer


Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread tvali
I thought about priorities too, so that it could be perfect for me.

It should be possible to add a package with a priority. I will give you a use case and an explanation of how I would use portage.

emerge --justadd kdebase-meta kicker --priority 10
emerge --justadd kdenetwork-meta kdeutils-meta kdeadmin-meta --priority 9
emerge --justadd kdeedu-meta kdetoys-meta kdegames-meta kdeartwork-meta --priority 1
emerge --justadd koffice-meta kdegraphics-meta kdemultimedia-meta --priority 5
emerge --justadd kdewebdev-meta kdevelop htmltidy kompare cervisia --priority 7
emerge --justadd kdepim-meta --priority 3
emerge --justadd kdeaddons-meta kde-meta --priority 20
emerge --changepriority kdeaddons-meta kde-meta --priority 0

Now, when I run emerge --update, it should:

* Sort packages so that packages with the highest priority are installed as soon as possible
* Start installing
* If something goes wrong, skip this package and everything that depends
on it. It should be configurable whether packages that don't depend on
it, but are needed by the same package, stay in their positions (go to
the end of the packages with the same priority) or go to the very end.

When I need to use my computer and need a lot of resources, I will press ctrl-c, use it, then start (continue) emerge --update.
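
One possible reading of the sorting step is sketched below. Everything in it is an assumption for illustration: --justadd and --priority are proposed options, not existing emerge flags, the function priority_order is made up, and the dependency of the KDE meta packages on kdelibs is invented for the example. Each dependency inherits the highest priority of anything that needs it, and a heap then always picks the highest-priority package whose dependencies are already built.

```python
import heapq


def priority_order(deps, priority):
    """Order packages so that high-priority ones are built as early as possible.

    deps maps every package to the list of packages it directly needs
    (every package must appear as a key). priority maps user-selected
    packages to a number; unlisted packages start at 0.
    """
    effective = dict.fromkeys(deps, 0)
    effective.update(priority)

    # A dependency is as urgent as the most urgent package that needs it.
    changed = True
    while changed:
        changed = False
        for pkg, needs in deps.items():
            for dep in needs:
                if effective[dep] < effective[pkg]:
                    effective[dep] = effective[pkg]
                    changed = True

    # Kahn's algorithm, with a heap so that the ready package with the
    # highest effective priority is always taken next.
    remaining = {pkg: set(needs) for pkg, needs in deps.items()}
    ready = [(-effective[pkg], pkg) for pkg, needs in remaining.items() if not needs]
    heapq.heapify(ready)
    order = []
    while ready:
        _, pkg = heapq.heappop(ready)
        order.append(pkg)
        for other, needs in remaining.items():
            if pkg in needs:
                needs.remove(pkg)
                if not needs:
                    heapq.heappush(ready, (-effective[other], other))
    return order


# Illustrative input; the kdelibs dependency is assumed, not taken from the tree.
deps = {
    "kdelibs": [],
    "kdebase-meta": ["kdelibs"],
    "kdegames-meta": ["kdelibs"],
    "kdepim-meta": ["kdelibs"],
}
priority = {"kdebase-meta": 10, "kdegames-meta": 1, "kdepim-meta": 3}
order = priority_order(deps, priority)
```

Here order comes out as kdelibs, kdebase-meta, kdepim-meta, kdegames-meta. Skipping failed packages is left out of this sketch.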



Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread tvali
Another place where this would be good is that it could interact well with a GUI and its Apply button.

Another thing that could imho be improved is the speed of the following functions:
* emerge -s, emerge -S
They should be cached somehow.

Also, I think that when speaking about the *speed* of portage, which is
imho notably slow right now (speed of searching, building trees and so
on), SQL support should be considered. In /usr/portage/, I see many
folders and files, which tend to contain one or two words, lists of
hashes and so on. Actually I see a relational database there. With many
caches and other optimizations it could be tuned to be fast, but with
SQL doing all that, it would be achieved very simply (maybe it would be
a jump to have all Linux configuration in one SQL filesystem? ...ok,
just fantasy :))

There are several small and simple SQL databases. With such a tool, big
parts of portage become unnecessary, and portage itself becomes much
faster. It should be optional whether one uses some tiny SQL, MySQL or a
big clustered SQL with NAS and a load balancer, but by default some tiny
version of SQL should be built.

Also, I currently don't see a way to view the dependency tree of an installed package.

Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread tvali
Ok, I am sending a lot of these, but hopefully they're interesting :)

I did a bit of research on adding SQL support to portage -- as far as
I can see, mysql is the smallest SQL server that can be emerged with
a python module.

To begin with, I think an SQL database structure should be created
that supports the basic fields from ebuilds (DESCRIPTION, HOMEPAGE,
SRC_URI, LICENSE, SLOT, KEYWORDS, IUSE, DEPEND, RDEPEND).

Then an emerge --createsql command should be set up, which adds data
from the /usr/portage/ filetree into this SQL database.

Then the portage_db files should be copied into a new folder, where
queries will be moved partially into SQL. emerge --usesql should affect
sys.path = [/usr/lib/portage/pym]+sys.path in emerge.py so that the user
can control whether she wants to use the experimental SQL functions.
Search and tree-building are the first functions that should get SQL
support, as they are the slowest.

The objective of adding SQL support would be to make portage's search
and pretend functions fast -- so that all real use of portage would be
without long waits anywhere -- the only waiting would then be in the
--update command, which could be run when no one is nearby.
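
A minimal sketch of the idea, assuming sqlite rather than mysql purely so that the example is self-contained. The table name "metadata", the cpv column and the helpers create_cache, add_ebuild and search are hypothetical; --createsql and --usesql are proposed options, not existing ones. The two search modes loosely mirror emerge -s (names only) and emerge -S (names and descriptions).

```python
import sqlite3

# The basic ebuild fields named in this message.
FIELDS = ("DESCRIPTION", "HOMEPAGE", "SRC_URI", "LICENSE", "SLOT",
          "KEYWORDS", "IUSE", "DEPEND", "RDEPEND")


def create_cache(conn):
    # One text column per ebuild field; "cpv" (category/package-version)
    # is an assumed key column.
    columns = ", ".join("%s TEXT" % name for name in FIELDS)
    conn.execute("CREATE TABLE metadata (cpv TEXT PRIMARY KEY, %s)" % columns)


def add_ebuild(conn, cpv, **values):
    # Fields that are not supplied are stored as empty strings.
    row = [cpv] + [values.get(name, "") for name in FIELDS]
    placeholders = ", ".join("?" * len(row))
    conn.execute("INSERT INTO metadata VALUES (%s)" % placeholders, row)


def search(conn, text, descriptions=False):
    """Return matching cpv strings, optionally searching descriptions too."""
    pattern = "%" + text + "%"
    if descriptions:
        query = ("SELECT cpv FROM metadata "
                 "WHERE cpv LIKE ? OR DESCRIPTION LIKE ? ORDER BY cpv")
        args = (pattern, pattern)
    else:
        query = "SELECT cpv FROM metadata WHERE cpv LIKE ? ORDER BY cpv"
        args = (pattern,)
    return [row[0] for row in conn.execute(query, args)]
```

A real cache would be filled by walking the tree once; here rows would be added by hand, for example add_ebuild(conn, "dev-db/mysql-4.1.14", DESCRIPTION="...").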

I have some questions now:
* Where could I find a basic ebuild specification?
* Where could I find a basic specification of the portage tree data
structure, if there is any?
* Does portage have some code documentation for faster learning?
(or should I just read the code?)

I think that I will start with SQL support, then go on to the UI -- as
SQL support seems to be the far more important functionality.

I would like to build the SQL table structure after getting this
information and send it to this list so that you can check whether
anything is missing or poorly optimized. PS. I don't like the rules
of database normalization, so my structures usually have some hacks in
them -- I hope that no one complains about that; normalized structures
are just dull and optimized only for dumb-user-human-readability,
which is imho not the biggest virtue of software code.

Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread tvali
Thank you very much!

This is what I meant about searching :) Would it now be good to merge
this esearch code into emerge, then add fields to the db and make
emerge tree-building fast, too? Or is there some utility for emerge
which does all that?

2006/3/14, Devon Miller [EMAIL PROTECTED]:
 Try emerge esearch. Replace emerge --sync with esync, emerge -s with
 esearch, and emerge -S with esearch -S.
 After running emerge --update ..., run eupdatedb.

 As for dependency tree: emerge --pretend --tree world
 Or, to also check additional dependencies: emerge --deep --pretend --tree
 world
 And, if you want to see how *everything* relates: emerge --deep --emptytree
 --pretend --tree

 dcm



Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread Marius Mauch

tvali wrote:

Ok, i send a lot of them, but hopefully they're interesting :)

I did research a bit about adding SQL support to portage -- as much as
i see, mysql is smallest sql server, which could be emerged with
python module.

In beginning, i think that SQL database structure should be created,
which supports basic fields from ebuilds (DESCRIPTION, HOMEPAGE,
SRC_URI, LICENSE, SLOT, KEYWORDS, IUSE, DEPEND, RDEPEND).

Then emerge --createsql command should be set up, which adds data from
/usr/portage/ filetree into this sql database.


You're talking about the cache, take a look at the cache subsystem and 
write a mysql module for it. This will never become a default though (we 
would get killed if portage starts to depend on mysql).



I have some questions now:
* Where i could find basic ebuild specification?


Developer handbook, ebuild(5), bash(1)


* Where i could find basic portage tree datastructure specification,
if there is any?


Not really.


* Does portage have some code documentation for faster learning?
(or should i just read the code?)


code documentation ... what's that? ;)

Marius



Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread Marius Mauch

tvali wrote:

I did think about some priorities too, so that it could be perfect for me.

It should be possible to add package with a priority. I will give you an 
use case and explanation how i would use portage.


Heh, make the dep resolver even more complex ;)
Also don't really see a need for such a feature, pretty much no benefit 
with a lot of additional complexity.


Marius



Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread Alec Warner
tvali wrote:
 Ok, i send a lot of them, but hopefully they're interesting :)
 
 I would like to build sql table-structure after getting the
 information and send it into this list here so that you could check out
 if it has something missing or poorly optimized. PS. i dont like rules
 of database normalizing, so my structures usually have some hacks in
 them -- i hope that no one complains about that; normalized structures
 are just dull and optimized only for dumb-user-human-readability,
 which is imho not the biggest virtue of software code.

Uhhh, you don't like normalizing your schemas?  Normalizing does two
things: it reduces redundant data, and it increases database speed, as
tables are designed in a manner that minimizes merging time.

Running with a schema you pulled out of your ass is likely to give poor
database performance.

-Alec Warner




Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread tvali
 You're talking about the cache, take a look at the cache subsystem and
 write a mysql module for it. This will never become a default though (we
 would get killed if portage starts to depend on mysql).

I think that it should not become the default as a mysql module, but if
it works, it should become the default as a portable sql module.

# emerge sqlite pysqlite

I haven't used sqlite, but it seems to be small and usable. I think
that this work should start with it.

I think that portage should *support* sql by default, but of course it
should not be the default before it's clear that many people like it
and use it. What is imho more important is how to make one usable
interface that covers both the fs and sql portage dbs, so that
development doesn't split into two products.




Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread felix
On Tue, Mar 14, 2006 at 12:10:57PM +0100, Simon Stelling wrote:
 tvali wrote:

 * When updating or installing new package, if something goes wrong,
 still emerge other packages, which havent failed packages as
 dependencies -- in my case, for example, when i do --update, for
 example, then the fact that update of first package fails does not
 definitely mean that i dont want to update others

 --resume

No, make -k.  At least that's what I would dearly like to see.

-- 
... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
 Felix Finch: scarecrow repairman  rocket surgeon / [EMAIL PROTECTED]
  GPG = E987 4493 C860 246C 3B1E  6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o



Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread Brian
On Tue, 2006-14-03 at 16:33 +0200, tvali wrote:
 I have thought about it now, and it seems to me that it would probably
 be much faster if esearch were not just another package, but part of
 portage.
 
 I mean -- the portage functions that query the db should use the
 esearch index wherever they need information that exists in that index.
 
 As far as I can understand, /var/cache/edb/ contains the esearch
 database in many files, and esearchdb.py is the search index as a
 python script.
 

I have been considering adding the esearch database and code into
porthole, or possibly a modified version of it.

 I think that SQL functions for searching and other things are maybe
 not much faster than py (as py seems to be fast enough, at least if
 there are more important things to do), but they are simpler to use.
 Therefore I think that rewriting the whole thing with SQL support may
 still be better.
 
 I will work out for myself what those emerge db functions do and
 still try to write something that connects them to SQL.
 

If I recall correctly (there has been lots of discussion about
converting portage to use databases; just check the mail archives and
forum), portage already has sqlite support, but it is not yet used.
Sqlite is smaller and has fewer dependencies than mysql.


 Anyway, I am still interested: is there some code or other
 documentation about how portage works, especially how it keeps its
 data?

Also, many of the features you talked about are already implemented in
porthole, such as continuing after a failed package, filtering out
warnings, important messages, etc.

Check it out.

-- 
Brian [EMAIL PROTECTED]




Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread solar
On Tue, 2006-03-14 at 15:50 +0200, Marius Mauch wrote:
 tvali wrote:
  I did think about some priorities too, so that it could be perfect for me.
  
  It should be possible to add a package with a priority. I will give you
  a use case and an explanation of how I would use portage.
 



 Heh, make the dep resolver even more complex ;)
 Also don't really see a need for such a feature, pretty much no benefit 
 with a lot of additional complexity.

agreed.


-- 
solar [EMAIL PROTECTED]
Gentoo Linux




Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread Johannes Fahrenkrug

tvali wrote:


I will consider what you said about db app design.

Anyway, I usually try to keep tables more dynamic and look at the task
at hand, trying to make tables specially for it. When I tested
normalizing, I got about 60 tables where I had 5 without normalizing.
 

I'm not a Gentoo dev, but a programmer who deals with software and db 
design issues every day.
Normalizing your data structures keeps them - and the apps that use them 
- flexible.


Of course a table with fields like customernr, customername, item1, 
item2, item3 is easier to create and smaller
than one table for the customers and one for items. But what if there's 
a 4th and a 5th item? You have to change
your table and every place in your app that uses it (which should only 
be one).
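
To make the customers/items example concrete, here is a small sqlite3
sketch of the normalized layout. The table and column names follow the
example above; the customer name and item values are placeholders.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customers (
    customernr   INTEGER PRIMARY KEY,
    customername TEXT
);
-- One row per item, so a 4th or 5th item needs no schema change.
CREATE TABLE items (
    customernr INTEGER REFERENCES customers(customernr),
    item       TEXT
);
""")

db.execute("INSERT INTO customers VALUES (1, 'Example Customer')")
db.executemany(
    "INSERT INTO items VALUES (1, ?)",
    [("item1",), ("item2",), ("item3",), ("item4",), ("item5",)],
)

# Counting or listing items is the same query for any number of items.
count = db.execute(
    "SELECT COUNT(*) FROM items WHERE customernr = 1"
).fetchone()[0]
```

With the item1/item2/item3 column layout, adding the last two rows
would have meant altering the table and every query that names those
columns.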


I assume you're also not too fond of design patterns because some 
require you to create 5 classes for something you could do with one ;-)...


- Johannes.





Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread Brian
On Tue, 2006-14-03 at 17:32 +0200, tvali wrote:
 2006/3/14, Brian [EMAIL PROTECTED]:
  On Tue, 2006-14-03 at 16:33 +0200, tvali wrote:
  If I recall correctly (there has been lots of discussion about
  converting portage to use databases; just check the mail archives and
  forum), portage already has sqlite support, but it is not yet used.
  Sqlite is smaller and has fewer dependencies than mysql.
 
 How do I use the sqlite support in portage?
 
  Also, many of the features you talked about are already implemented in
  porthole, such as continuing after a failed package, filtering out
  warnings, important messages, etc.
 
  Check it out.
 
 Is this ok:
 !!! All ebuilds that could satisfy porthole have been masked.
 Or is there any, which is not masked?

It is masked because of a gtk bug that will segfault if you expand the
Dependencies listing in the upgradeable view.  It will segfault when
you return to any other view unless you make it re-sort the list or make
it rebuild the list.  It is something that did not occur in earlier
versions of gtk.  Actually earlier versions of porthole were much more
unstable and segfaulted due to numerous other coding errors that were
difficult to track down, but were not masked.

Currently the only fix is to re-code porthole to use the treeviews
differently (a fairly major undertaking I do not have time for yet).
Currently each view has its own model and we switch the models for the
treeview.  The other way would be to have only one model, clearing it
and re-populating it with different data when switching views.  That
is probably the better way to do it in the long run, especially if
someone were to make a KDE interface for it.  That way it would be much
easier to use either a GTK or KDE interface.
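
A toolkit-independent sketch of the "one model, clear and re-populate"
approach. PackageModel and switch_view are invented names for this
illustration and are not porthole code; a real version would wrap a
GTK or KDE model behind the same two methods.

```python
class PackageModel:
    """Single model shared by all views (hypothetical stand-in)."""

    def __init__(self):
        self.rows = []

    def clear(self):
        self.rows.clear()

    def append(self, row):
        self.rows.append(row)


def switch_view(model, view_name, data_sources):
    """Empty the one model and refill it with the chosen view's data.

    data_sources maps a view name to a callable returning its rows.
    """
    model.clear()
    for row in data_sources[view_name]():
        model.append(row)
```

Because the model is only ever touched through clear() and append(),
the GUI layer on top of it can be swapped without changing the data
code.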

 
 And -- if portage is meant as the main engine and porthole as its GUI,
 isn't it a bit fuzzy to add speed-ups to porthole instead of portage?
 If it continues like that, it may end up with someone writing a
 command-line tool for controlling porthole :P I think that if an
 application has 2 layers, one for logic and another for GUI, then it's
 maybe not the best way of coding to add such features to the GUI
 part of the package. I personally would definitely try to make portage
 itself support indexing and other such stuff to keep things clean. Am
 I wrong? Or is it planned to make gentoo a GUI linux with very weak
 command-line support?
 
 I think that GUI code would be *clean* if it's just a GUI!
 

If you can get it implemented in portage, so much the better.  If not, I
have had feature requests to add it to porthole to speed up description
searching (many users use both porthole and command line tools).
Searching can currently be slow in porthole, which does not load
descriptions for searches unless that is enabled, and then it fetches
all descriptions.  Once they are loaded, searches are quick again.
Adding support for the esearch db would speed that up, since a db is
already created and hopefully already updated.

-- 
Brian [EMAIL PROTECTED]




Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread tvali
I wasn't thinking of the Item1, Item2, Item3 case.

I was thinking of cases where, for example, I use an Id field made up
of TableName and IdInThatTable, where TableName shows which table this
IdInThatTable points into, and so on. I don't use Item1/2/3 either :)
I just sometimes use tables in a more generalized form, where it's hard
to tell from the table name or the fields what it is supposed to
contain, as it contains different things in different cases -- which
lets me put more functionality into general data structures rather
than writing specific tables with specific functions. Anyway, this is
just an example -- I think that normalizing means there is only one way
to do things, but I like to rethink each specific case (and in some
cases the normalized table just appears best to me, but not because
it's normalized). Nothing more. Anyway, I think that this is not a
topic to discuss on this list :)

I think that db-app optimization was the best possible argument in
favor of normalization. The other arguments are ones that would occur
to me too, but I haven't thought much about which dbs are optimized for
which structures -- and it seems that, since normalization is the norm,
most engines probably really are optimized for it.
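
One possible reading of the TableName / IdInThatTable idea, sketched
with sqlite3. The "notes" table, the other table names and all row
values are invented for this example; it only illustrates the general
pattern of one table pointing into several others.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE packages (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE servers  (id INTEGER PRIMARY KEY, name TEXT);
-- A generalized table: each row can point into any other table.
CREATE TABLE notes (table_name TEXT, id_in_that_table INTEGER, note TEXT);
INSERT INTO packages VALUES (1, 'dev-db/mysql');
INSERT INTO servers  VALUES (1, 'mirror-a');
INSERT INTO notes VALUES ('packages', 1, 'needs testing');
INSERT INTO notes VALUES ('servers',  1, 'slow on weekends');
""")

# Table names cannot be bound as SQL parameters, so they are checked
# against a fixed whitelist before being placed in the query text.
ALLOWED_TABLES = {"packages", "servers"}


def resolve(table_name, row_id):
    """Look up the name of the row a (table_name, id) pair points to."""
    if table_name not in ALLOWED_TABLES:
        raise ValueError("unknown table: %s" % table_name)
    row = db.execute(
        "SELECT name FROM %s WHERE id = ?" % table_name, (row_id,)
    ).fetchone()
    return row[0] if row else None


note_rows = db.execute(
    "SELECT table_name, id_in_that_table, note FROM notes ORDER BY rowid"
).fetchall()
resolved = [(resolve(t, i), note) for t, i, note in note_rows]
```

The trade-off shows up in resolve(): the database cannot declare a
foreign key for such a pair, so the lookup and its validation live in
application code.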

2006/3/14, Johannes Fahrenkrug [EMAIL PROTECTED]:
 tvali wrote:

 I will consider what you said about db app design.
 
 Anyway, I usually try to keep tables more dynamic and look at the task
 at hand, trying to make tables specially for it. When I tested
 normalizing, I got about 60 tables where I had 5 without normalizing.
 
 
 I'm not a Gentoo dev, but a programmer who deals with software and db
 design issues every day.
 Normalizing your data structures keeps them - and the apps that use them
 - flexible.

 Of course a table with fields like customernr, customername, item1,
 item2, item3 is easier to create and smaller
 than one table for the customers and one for items. But what if there's
 a 4th and a 5th item? You have to change
 your table and every place in your app that uses it (which should only
 be one).

 I assume you're also not too fond of design patterns because some
 require you to create 5 classes for something you could do with one ;-)...

 - Johannes.


 --
 gentoo-portage-dev@gentoo.org mailing list







Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread tvali
2006/3/14, solar [EMAIL PROTECTED]:
 On Tue, 2006-03-14 at 15:50 +0200, Marius Mauch wrote:
  Heh, make the dep resolver even more complex ;)
  Also don't really see a need for such a feature, pretty much no benefit
  with a lot of additional complexity.

 agreed.

Ok, but what about making it layered (several layers or passes)? To
explain my thoughts about how packages should be sorted:

Layer 1 would calculate the real dependencies of packages -- using the
current use flags. Those dependencies would also be saved into the
portage tree after calculating (--newuse would update them).

Layer 2 would be used to make a list of all packages that would be
installed right now -- this happens when some emerge or pretend is
being done.

Layer 3 would only use those calculated dependencies and the
ready-selected list of packages, and sort them.

One possible sort would look like this:

* Packages that have been given an importance will keep this importance
forever -- it will be written to world.
* When sorting packages, only those that are directly in world will
be considered -- dependencies are installed as late as possible.

This should be enough:
1. Make a list
2. Add all packages, starting with those that have higher priority
3. Add their dependencies, trying to put each of them right before the
first package that needs it (more advanced sorting would also use
default priorities of dependencies -- putting them before the first
package that needs them, and after the last package that has higher
importance than this first package)

Any other sort optimizations would be unnecessary, as the user can just
use the right priority numbers to get everything ok with this type of
sorting.
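
The basic three-step sort above (without the "more advanced" variant)
could be sketched like this. order_packages, the priority numbers and
the package names in the usage note are made up for illustration; this
is not the portage dep resolver.

```python
def order_packages(priorities, deps):
    """Order world packages by priority, dependencies as late as possible.

    priorities: dict mapping world packages to a number (higher = sooner).
    deps:       dict mapping any package to a list of its dependencies,
                assumed to be already calculated (Layer 1 above).
    """
    ordered, seen = [], set()

    def place(pkg):
        # Marking before recursing makes a dependency cycle terminate,
        # although the order inside a cycle is then arbitrary.
        if pkg in seen:
            return
        seen.add(pkg)
        for dep in sorted(deps.get(pkg, ())):
            place(dep)  # lands right before the first package needing it
        ordered.append(pkg)

    # Step 2: highest priority first; ties keep their original order.
    for pkg in sorted(priorities, key=lambda p: -priorities[p]):
        place(pkg)
    return ordered
```

For priorities {"toys": 1, "desktop": 10} with desktop depending on
libgui and toys depending on libgui and libfun, the result is libgui,
desktop, libfun, toys: libfun is not built until toys actually needs
it.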

Sorting is especially useful when the user makes a long list of
packages and then starts to emerge them all at once.

In my case I installed gentoo only in the daytime, keeping my computer
off at night (as I don't like its sound in my room when I sleep). It
took several days -- and I started to use it right after getting it
up; I also started using kde as soon as it included kicker, konqueror
and konsole. I wanted to install graphics, toys and other such things
last. And I didn't want to have pauses between emerges, but I had
them, because I wasn't always there when one emerge finished (as
they took hours).




Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread tvali
Ok, I think I have a reasonable amount of information to start
something now :) As a last question -- what do I do after making some
additions to portage? Where do I send them?

2006/3/14, tvali [EMAIL PROTECTED]:
 2006/3/14, solar [EMAIL PROTECTED]:
  On Tue, 2006-03-14 at 15:50 +0200, Marius Mauch wrote:
   Heh, make the dep resolver even more complex ;)
   Also don't really see a need for such a feature, pretty much no benefit
   with a lot of additional complexity.
 
  agreed.

 Ok, but what about making it layered (several layers or passes)? To
 explain my thoughts about how packages should be sorted:

 Layer 1 would calculate the real dependencies of packages -- using the
 current use flags. Those dependencies would also be saved into the
 portage tree after calculating (--newuse would update them).

 Layer 2 would be used to make a list of all packages that would be
 installed right now -- this happens when some emerge or pretend is
 being done.

 Layer 3 would only use those calculated dependencies and the
 ready-selected list of packages, and sort them.

 One possible sort would look like this:

 * Packages that have been given an importance will keep this importance
 forever -- it will be written to world.
 * When sorting packages, only those that are directly in world will
 be considered -- dependencies are installed as late as possible.

 This should be enough:
 1. Make a list
 2. Add all packages, starting with those that have higher priority
 3. Add their dependencies, trying to put each of them right before the
 first package that needs it (more advanced sorting would also use
 default priorities of dependencies -- putting them before the first
 package that needs them, and after the last package that has higher
 importance than this first package)

 Any other sort optimizations would be unnecessary, as the user can just
 use the right priority numbers to get everything ok with this type of
 sorting.

 Sorting is especially useful when the user makes a long list of
 packages and then starts to emerge them all at once.

 In my case I installed gentoo only in the daytime, keeping my computer
 off at night (as I don't like its sound in my room when I sleep). It
 took several days -- and I started to use it right after getting it
 up; I also started using kde as soon as it included kicker, konqueror
 and konsole. I wanted to install graphics, toys and other such things
 last. And I didn't want to have pauses between emerges, but I had
 them, because I wasn't always there when one emerge finished (as
 they took hours).







Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread Grant Goodyear
tvali wrote:
 Ok, I think I have a reasonable amount of information to start
 something now :) As a last question -- what do I do after making some
 additions to portage? Where do I send them?

bugs.gentoo.org





Re: [gentoo-portage-dev] Few things, which imho would make portage better

2006-03-14 Thread Grant Goodyear
tvali wrote:
 And if it's not a bugfix? Still there?

Yes, please.  We use bugs.gentoo.org to track nearly everything Gentoo
related.

-g2boojum-





esearch integration [was Re: [gentoo-portage-dev] Few things, which imho would make portage better]

2006-03-14 Thread Brian Harring
On Tue, Mar 14, 2006 at 04:33:06PM +0200, tvali wrote:
 I have thought about it now, and it seems to me that it would probably
 be much faster if esearch were not just another package, but part of
 portage.
 
 I mean -- the portage functions that query the db should use the
 esearch index wherever they need information that exists in that index.
 
 As far as I can understand, /var/cache/edb/ contains the esearch
 database in many files, and esearchdb.py is the search index as a
 python script.

No...
esearch is a static db- only useful for 'frozen' trees, eg rsync 
distributed trees with no eclasses in overlays.  All cvs users (devs) 
run unfrozen trees (readonly/readwrite is better terminology), thus 
portage updates the cache db on the fly as needed.

If esearch was integrated into portage the result would be stale 
metadata for cvs users, and stale metadata for rsync users when 
overlays with eclasses are involved- no go.
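
A rough illustration of the staleness problem, assuming a cache entry
records the modification times of the ebuild and eclasses it was
generated from. The function and file names below are made up for this
sketch and are not portage or esearch code; a static index that skips
this kind of check keeps serving the old entry.

```python
def is_stale(recorded, current):
    """Return True if any file the cache entry was built from changed.

    recorded: {path: mtime} saved when the entry was generated.
    current:  {path: mtime} as seen on disk now (missing path = deleted).
    """
    return any(current.get(path) != mtime for path, mtime in recorded.items())


# Entry generated from one ebuild and one eclass (placeholder values).
recorded = {"foo-1.0.ebuild": 100, "eclass/bar.eclass": 50}
unchanged = dict(recorded)
# An overlay ships a newer copy of the eclass; the ebuild is untouched.
overlay_edit = {"foo-1.0.ebuild": 100, "eclass/bar.eclass": 75}
```

Here is_stale(recorded, overlay_edit) is true even though the ebuild
itself did not change, which is the overlay-with-eclasses case
mentioned above.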

That and esearch last I looked just generates a giant dict (thus the 
cache is in memory), which kind of blows the 25mb mem usage 2.1 
now sports :)

~harring




sync suggestions [was Re: [gentoo-portage-dev] Few things, which imho would make portage better]

2006-03-14 Thread Brian Harring
On Tue, Mar 14, 2006 at 03:50:18PM +0200, tvali wrote:
 Another question now is about sync.
 
 I read somewhere that it is not good user behavior to sync more
 than once per day. I understand this to mean that it is a huge
 download even if nothing has changed.
 
 Wouldn't it be a nice idea to just have this database optimized?
 
 I mean (assuming portage uses SQL now) -- it would be really simple
 to log every change in the portage tree as a series of SQL queries
 that would reproduce this change.

Pushing the delta (what you're suggesting) is only usable if it can be
guaranteed the user hasn't modified their tree at all (a modification
would result in the cache db differing from upstream's).

That right there is the brass tacks of it: you wouldn't be able to
push just the changes, you would have to regenerate the _whole_ db
(slow, 20k inserts assuming only one table).
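
A small sketch of that constraint, assuming the db is modelled as a
dict and every delta names the exact state it was computed against.
fingerprint, sync and the delta layout are invented for illustration;
nothing like this exists in portage.

```python
import hashlib


def fingerprint(tree):
    """Hash a {package: metadata} dict in a stable key order."""
    digest = hashlib.sha256()
    for key in sorted(tree):
        digest.update(key.encode() + b"\0" + tree[key].encode() + b"\0")
    return digest.hexdigest()


def sync(local, delta, regenerate):
    """Apply the delta only if the local db matches its base state.

    delta:      {"base": fingerprint, "changed": {...}, "removed": [...]}
    regenerate: callable that rebuilds the whole db (the slow path).
    """
    if fingerprint(local) != delta["base"]:
        # The user modified the tree, so the delta no longer applies.
        return regenerate(), "full"
    updated = dict(local)
    updated.update(delta["changed"])
    for key in delta["removed"]:
        updated.pop(key, None)
    return updated, "delta"
```

Any local edit, such as a user-changed description, alters the
fingerprint and forces the full regeneration path.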

Sidenote... please post separate threads for separate
ideas/discussions, else it's damn hard to look back and pull the
specific thread where something was discussed.
~harring


