Re: [gentoo-dev] Six month major project on Gentoo

2011-12-15 Thread Mike Frysinger
On Thursday 15 December 2011 00:39:44 Nirbheek Chauhan wrote:
 On Thu, Dec 15, 2011 at 5:28 AM, Mike Frysinger vap...@gentoo.org wrote:
  On Wednesday 14 December 2011 18:43:33 Alec Warner wrote:
  On Wed, Dec 14, 2011 at 1:25 PM, Leho Kraav l...@kraav.com wrote:
   i'd be really happy if someone took care of
   https://bugs.gentoo.org/150031 :
   
   include more info about binpkg in file name
  
  That is great, but it's not a 6 month project...
  
  is it though ?  i'm inclined to mark INVALID.  hijacking filenames for
  metadata is a tuuurrible idea.
 
 I agree. It's along the same lines as only using file extensions for
 determining the filetype (and we all know how that turned out...). It
 *does* have the advantage of being really fast, though.

it just doesn't scale though (encoding all metadata into the filename quickly 
hits filesystem limits on name length), and i think the speed increase is only 
to a limit.  once you get into larger repos, using the already existing 
Packages file would be faster.  and since that compresses, it should scale a 
lot nicer.

 Nevertheless, the basic bug is about changing the distfile repository
 format in such a way that a single repo can contain several distfiles
 built with differing build conditions. Putting metadata in the
 filename is only one way of ensuring that.

sounds like the summary needs updating then by someone who has waded through 
all the followup comments ;)
-mike




Re: [gentoo-dev] Six month major project on Gentoo

2011-12-15 Thread Nirbheek Chauhan
On Thu, Dec 15, 2011 at 5:57 PM, Mike Frysinger vap...@gentoo.org wrote:
 On Thursday 15 December 2011 00:39:44 Nirbheek Chauhan wrote:
 Nevertheless, the basic bug is about changing the distfile repository
 format in such a way that a single repo can contain several distfiles
 built with differing build conditions. Putting metadata in the
 filename is only one way of ensuring that.

 sounds like the summary needs updating then by someone who has waded through
 all the followup comments ;)

I didn't read every word, but I think I got the gist. I've changed the
subject accordingly. :)

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team



Re: [gentoo-dev] Six month major project on Gentoo

2011-12-15 Thread Rich Freeman
On Thu, Dec 15, 2011 at 12:39 AM, Nirbheek Chauhan nirbh...@gentoo.org wrote:
 Nevertheless, the basic bug is about changing the distfile repository
 format in such a way that a single repo can contain several distfiles
 built with differing build conditions. Putting metadata in the
 filename is only one way of ensuring that.

Well, having the filename vary when the metadata changes is the only
way of ensuring that.  Putting the metadata in the filename is just
one of many ways to make the filename vary.

Another solution (which I can already sense the objections to), would
be to content-hash the files and use that as the filename.  Then use
indexes to point to the files.  You could use symlink indexes to point
to the files so that superficially it looks the same as it does now
for the last version emerged.  Then people looking for a particular
set of metadata could use more detailed indexes to find the right
file.  Portage could look for an exact match when trying to merge a
binpkg since searching indexes is a trivial problem.

The indexes could be anything from text files to binary files to
databases to a couple of directory trees full of symlinks (like
/dev/disk/by-*).  The symlinks could get tricky with all the metadata
- it might make more sense to just keep it simple and use something
more like a database for the full details and symlinks for the basics.

There are countless variations on this as well - like sticking a copy
of the environment for each package in a separate text file with the
same base name so that it is easy to grep/search/etc.

You can also make it more user-friendly by keeping the PF in the
filename followed by the hash - like gvim-1.23-r1-723ba298d92f.  In
such a case you probably don't even need to index the PFs.
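A minimal sketch of the ${PF}-hash naming described above, not how Portage actually stores binpkgs. The helper name store_binpkg, the 12-digit SHA-256 prefix, and the directory layout are all assumptions made for illustration; only the gvim-1.23-r1-<hash> shape comes from the message. It assumes GNU coreutils (sha256sum, ln -sfn).

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a binpkg into PKGDIR as "${PF}-<hash>.tbz2"
# and keep a plain "${PF}.tbz2" symlink pointing at the latest build.
store_binpkg() {
	local src=$1 pkgdir=$2
	local pf hash dest
	pf=$(basename "${src}" .tbz2)
	# First 12 hex digits of the SHA-256 of the file contents.
	hash=$(sha256sum "${src}" | cut -c1-12)
	dest="${pf}-${hash}.tbz2"
	mkdir -p "${pkgdir}"
	cp "${src}" "${pkgdir}/${dest}"
	# Relative symlink, so the directory still looks the way it does now.
	ln -sfn "${dest}" "${pkgdir}/${pf}.tbz2"
	printf '%s\n' "${dest}"
}

# Demo with a stand-in file (not a real tbz2).
tmp=$(mktemp -d)
printf 'build A\n' > "${tmp}/gvim-1.23-r1.tbz2"
name=$(store_binpkg "${tmp}/gvim-1.23-r1.tbz2" "${tmp}/packages")
echo "stored as: ${name}"
echo "symlink -> $(readlink "${tmp}/packages/gvim-1.23-r1.tbz2")"
rm -rf "${tmp}"
```

Two builds with different contents get different names, so they can sit side by side in one directory, while the symlink keeps the old single-name view for the last build.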

Rich



Re: [gentoo-dev] Six month major project on Gentoo

2011-12-15 Thread Mike Frysinger
On Thursday 15 December 2011 07:43:26 Rich Freeman wrote:
 On Thu, Dec 15, 2011 at 12:39 AM, Nirbheek Chauhan wrote:
  Nevertheless, the basic bug is about changing the distfile repository
  format in such a way that a single repo can contain several distfiles
  built with differing build conditions. Putting metadata in the
  filename is only one way of ensuring that.
 
 Well, having the filename vary when the metadata changes is the only
 way of ensuring that.  Putting the metadata in the filename is just
 one of many ways to make the filename vary.

there is more raw metadata available than fits into a filename.  so that is 
already a non-starter.

if people want to post multiple binpkgs with different metadata

 Another solution (which I can already sense the objections to), would
 be to content-hash the files and use that as the filename.  Then use
 indexes to point to the files.  You could use symlink indexes to point
 to the files so that superficially it looks the same as it does now
 for the last version emerged.  Then people looking for a particular
 set of metadata could use more detailed indexes to find the right
 file.  Portage could look for an exact match when trying to merge a
 binpkg since searching indexes is a trivial problem.

we already hash all the files (i.e. the CONTENTS file), so using the hash of 
that file alone wouldn't be a bad idea.  although i think that file gets 
generated on the fly when merging the binpkg (seems like a waste to not cache 
that in the binary package ...).

 There are countless variations on this as well - like sticking a copy
 of the environment for each package in a separate text file with the
 same base name so that it is easy to grep/search/etc.

the env is already trivial to extract:
qtbz2 -x -O bison-2.5.tbz2 | qxpak -O -x - environment.bz2 | bzgrep foo

 You can also make it more user-friendly by keeping the PF in the
 filename followed by the hash - like gvim-1.23-r1-723ba298d92f.  In
 such a case you probably don't even need to index the PFs.

portage should be using the generated Packages index to look up actual tbz2 
filenames, so having ${PF}-hash.tbz2 shouldn't be too painful.
-mike




Re: [gentoo-dev] Six month major project on Gentoo

2011-12-15 Thread Mike Frysinger
On Thursday 15 December 2011 11:30:46 Mike Frysinger wrote:
 if people want to post multiple binpkgs with different metadata

err, half formed thought here ... ignore
-mike




Re: [gentoo-dev] Six month major project on Gentoo

2011-12-15 Thread Gaurav Saxena
Hello all, thanks a lot for your replies.

Christian, I am interested in OpenRC. I would like to know more about
what kinds of projects could be done there.

On Wed, Dec 14, 2011 at 11:35 PM, Christian Ruppert id...@gentoo.org wrote:
 On Wednesday 14 December 2011 16:36:42 Gaurav Saxena wrote:
 Hello all,
 I am interested in doing my final year computer science project on gentoo. I
 would be having a duration of six months to work on the project. Could you
 please suggest me some good project ideas that would be helpful to me as
 well as gentoo. I am interested in parallel computing, data structures ,
 operating system. I am well versed in C/C++. I think  there might be
 projects which need to be done, I would like to work on them.

 What about OpenRC? :)
 We could need some help.
 http://www.gentoo.org/proj/en/base/openrc/
 #openrc or #gentoo-base (IRC) via FreeNode
 Or ope...@gentoo.org.

 Let me know if you're interested or need more details :)

 --
 Regards,
 Christian Ruppert
 Gentoo Linux developer, Bugzilla administrator and Infrastructure member
 Fingerprint: EEB1 C341 7C84 B274 6C59 F243 5EAB 0C62 B427 ABC8



-- 
Thanks and Regards ,
Gaurav



Re: [gentoo-dev] Six month major project on Gentoo

2011-12-15 Thread Gaurav Saxena
Hello Alec,
I am interested in a distributed compiler like this; it looks like
quite a good project. I would like to work on it. If possible, could
you please point me to more details about the project so that I can
decide what to work on?

On Wed, Dec 14, 2011 at 11:59 PM, Alec Warner anta...@gentoo.org wrote:
 On Wed, Dec 14, 2011 at 3:06 AM, Gaurav Saxena grvsaxena...@gmail.com wrote:
 Hello all,
 I am interested in doing my final year computer science project on gentoo. I
 would be having a duration of six months to work on the project. Could you
 please suggest me some good project ideas that would be helpful to me as
 well as gentoo. I am interested in parallel computing, data structures ,
 operating system. I am well versed in C/C++. I think  there might be
 projects which need to be done, I would like to work on them.

 The only idea I can think of for parallel computing / distributed
 systems would be at the build level.

 distcc-ng, a farm of user-controlled machines that compile your code
 in a p2p fashion.
 a distributed hash table of input, output tuples (basically .o caching
 so users can fetch the .o from the DHT)

 Both of these have *massive* trust issues. When random guys on the
 internet are compiling your code you have to be very careful about how
 you verify and execute that code. When you fetch .o files from a DHT
 you have the same problem.

 Almost every other problem I can think of at the Gentoo OS level can
 fit on a meager sized machine (i.e. it is not a distributed systems
 nor a parallel computing problem.)

 Many of the annoying parts of Gentoo are merely tools problems; the
 existing tools are poor / under-maintained or standard tools do not
 exist (so users / developers roll their own.) You may have success in
 the tools arena if you talk to mgorny or portage-utils@; I know mgorny
 has written a few C tools and might have sufficient 'gentoo' C
 libraries you could utilize; the portage-utils alias holds the
 portage-utils authors (portage-utils being another set of tools
 written in C for gentoo.)

 I actually liked cbergstrom's idea of toolchain-type stuff; but I'm
 not really sure how easy it is to on-board with those communities
 (lord knows in my senior year of CS I would have been useless working
 on a compiler.)


 --
 Thanks and Regards ,
 Gaurav




-- 
Thanks and Regards ,
Gaurav



[gentoo-dev] About removing secure-tunneling herd

2011-12-15 Thread Pacho Ramos
Looking at http://www.gentoo.org/proj/en/metastructure/herds/herds.xml,
I see that the secure-tunneling herd appears to be empty. It also appears
to have no bugs assigned to it currently, and I couldn't find its mail alias.
Is everyone OK with removing it?

Thanks





[gentoo-dev] About removing utf8 herd

2011-12-15 Thread Pacho Ramos
Hello

Looking at the bugs assigned to that herd, it seems that most of them are
obsolete or should probably be handled by the package maintainers rather
than by a separate herd.

Maybe the utf8 herd should disappear. Are you OK with that, or would you
prefer to keep it alive?

Thanks






[gentoo-dev] Re: estack_{push,pop}: cool new helpers or over engineering?

2011-12-15 Thread Steven J Long
Just to point out that arithmetic context can be more efficient; no bugs, 
except for a /minor/ possibility (second last comment.)

Mike Frysinger wrote:
 --- eutils.eclass 14 Dec 2011 17:36:18 -  1.372
 +++ eutils.eclass 14 Dec 2011 23:46:37 -
 @@ -100,6 +100,54 @@ esvn_clean() {
  find "$@" -type d -name '.svn' -prune -print0 | xargs -0 rm -rf
  }
  
 +# @FUNCTION: estack_push
 +# @USAGE: stack [items to push]
 +# @DESCRIPTION:
 +# Push any number of items onto the specified stack.  Pick a name that
 +# is a valid variable (i.e. stick to alphanumerics), and push as many
 +# items as you like onto the stack at once.
 +#
 +# The following code snippet will echo 5, then 4, then 3, then ...
 +# @CODE
 +#estack_push mystack 1 2 3 4 5
 +#while estack_pop mystack i ; do
 +#echo ${i}
A minor #bash point in passing: although these values of i are safe, for
tutorial code, I really would recommend quoting: echo "$i" (or "${i}"). It's
better to get people used to quoting by default, and only not quoting iff
they need field-splitting on parameter expansions (eg for a variable used
for command options.)
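To illustrate the quoting advice, here is a small sketch of what field-splitting does to an unquoted expansion. The helper count_args and the sample value are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Report how many arguments a command actually receives.
count_args() { echo $#; }

i='a   b'

echo $i                 # unquoted: split on whitespace, runs of spaces collapse
echo "$i"               # quoted: passed through as one word, spaces intact

count_args $i           # two arguments after field-splitting
count_args "$i"         # one argument
```

With values like 1 through 5 the two forms behave the same, which is why the unquoted version in the doc comment happens to be safe.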

 +#done
 +# @CODE
 +estack_push() {
 + [[ $# -eq 0 ]] && die "estack_push: incorrect # of arguments"
 + local stack_name="__ESTACK_$1__" ; shift
 + eval ${stack_name}+=\( \"\$@\" \)
 +}
((..)) is quicker than [[ .. ]] for arithmetic stuff, and usually easier to 
grok swiftly.
(($#)) || die .. is how this would normally be done.
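A short sketch of the arithmetic-context test next to the [[ ]] form, to show that they agree. The function names are invented for the example, and no timing claim is made here.

```shell
#!/usr/bin/env bash
# (( expr )) succeeds when expr is non-zero, so (($#)) means "got arguments".
has_args_arith() { (($#)); }

# The equivalent spelled with the conditional command.
has_args_test() { [[ $# -ne 0 ]]; }

report() {
	if (($#)); then
		echo "$# argument(s)"
	else
		echo "no arguments"
	fi
}

report
report a b
```

Because (($#)) fails only when the count is zero, "(($#)) || die ..." reads as "arguments, or else die".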

 +
 +# @FUNCTION: estack_pop
 +# @USAGE: stack [variable]
 +# @DESCRIPTION:
 +# Pop a single item off the specified stack.  If a variable is specified,
 +# the popped item is stored there.  If no more items are available, return
 +# 1, else return 0.  See estack_push for more info.
 +estack_pop() {
 + ( [[ $# -eq 0 ]] || [[ $# -gt 2 ]] ) && die "estack_pop: incorrect # of arguments"

(($# == 0 || $# > 2)) && die.. # does it in one command, with no subshell.
[[ $# -eq 0 || $# -gt 2 ]] && die .. would work too, but more slowly.
In general if you want to do complex chains without a subshell, you would
use: { } && .. instead of: ( ) && ..

TBH I would type (($#==0||$#>2)) in bash, though I space in C, where it 
doesn't affect execution time. But it's not as clear, especially if you're 
not in a highlighting editor.
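A small sketch of the grouping point: ( ) forks a subshell, so nothing done inside it affects the caller, while { } runs in the current shell. The variable and function names are illustrative only.

```shell
#!/usr/bin/env bash
count=0

( count=1 )        # subshell: the assignment is lost when it exits
echo "after ( ): ${count}"

{ count=2; }       # current shell: the assignment sticks
echo "after { }: ${count}"

# The argument-count check as a single arithmetic command, no subshell.
bad_argc() { (($# == 0 || $# > 2)); }

bad_argc a b   || echo "two arguments: accepted"
bad_argc a b c && echo "three arguments: rejected"
```

Note that { } needs whitespace after the opening brace and a ; or newline before the closing one.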

 + # We use the fugly __estack_xxx var names to avoid collision with
 + # passing back the return value.  If we used "local i" and the
 + # caller ran `estack_pop ... i`, we'd end up setting the local
 + # copy of "i" rather than the caller's copy.  The __estack_xxx
 + # garbage is preferable to using $1/$2 everywhere as that is a
 + # bit harder to read.
 + local __estack_name="__ESTACK_$1__" ; shift
 + local __estack_retvar=$1 ; shift
 + eval local __estack_i=\${#${__estack_name}[@]}
 + # Don't warn -- let the caller interpret this as a failure
 + # or as normal behavior (akin to `shift`)
 + [[ $(( --__estack_i )) -eq -1 ]] && return 1
((--__estack_i == -1)) && ..

 +
 + if [[ -n ${__estack_retvar} ]] ; then
 + eval ${__estack_retvar}=\"\${${__estack_name}[${__estack_i}]}\"
 + fi
 + eval unset ${__estack_name}[${__estack_i}]
 +}
 +
  # @FUNCTION: eshopts_push
  # @USAGE: [options to `set` or `shopt`]
  # @DESCRIPTION:
 @@ -126,15 +174,14 @@ esvn_clean() {
  eshopts_push() {
  # have to assume __ESHOPTS_SAVE__ isn't screwed with
  # as a `declare -a` here will reset its value
 - local i=${#__ESHOPTS_SAVE__[@]}
  if [[ $1 == -[su] ]] ; then
 - __ESHOPTS_SAVE__[$i]=$(shopt -p)
 + estack_push eshopts "$(shopt -p)"
  [[ $# -eq 0 ]] && return 0
I'm not sure how this will ever match, given that $1 has been checked above?
(($#==1)) && return 0 # if that applies (might be a 'bug'.)

  shopt "$@" || die "eshopts_push: bad options to shopt: $*"
  else
 - __ESHOPTS_SAVE__[$i]=$-
 + estack_push eshopts $-
  [[ $# -eq 0 ]] && return 0
(($#)) || return 0

  set "$@" || die "eshopts_push: bad options to set: $*"
  fi
  }
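For anyone who wants to try the helpers outside an ebuild, here is a standalone sketch of the two stack functions with the arithmetic-context tests swapped in. It is a reading of the patch for experimentation, not the committed eclass code; die is a stand-in defined here because the real one is provided by Portage.

```shell
#!/usr/bin/env bash
# Stand-in for Portage's die (assumption: outside an ebuild we just exit).
die() { echo "die: $*" >&2; exit 1; }

estack_push() {
	(($#)) || die "estack_push: incorrect # of arguments"
	local stack_name="__ESTACK_$1__" ; shift
	# Expands to: __ESTACK_<name>__+=( "$@" )
	eval ${stack_name}+=\( \"\$@\" \)
}

estack_pop() {
	(($# == 0 || $# > 2)) && die "estack_pop: incorrect # of arguments"
	local __estack_name="__ESTACK_$1__" ; shift
	local __estack_retvar=$1 ; shift
	# Current depth of the stack (0 if it was never pushed to).
	eval local __estack_i=\${#${__estack_name}[@]}
	# Empty stack: fail quietly, like `shift` with nothing left.
	((--__estack_i == -1)) && return 1
	if [[ -n ${__estack_retvar} ]] ; then
		eval ${__estack_retvar}=\"\${${__estack_name}[${__estack_i}]}\"
	fi
	eval unset ${__estack_name}[${__estack_i}]
}

estack_push mystack 1 2 3 4 5
popped=()
while estack_pop mystack i ; do
	popped+=( "${i}" )
done
echo "${popped[*]}"
```

The loop pops in last-in, first-out order and stops once estack_pop returns 1 on the empty stack.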
  

HTH,
Steve.
-- 
#friendly-coders -- We're friendly, but we're not /that/ friendly ;-)