[fossil-users] A large repo import succeeded!

2010-12-02 Thread Kumar
I've been using fossil for my personal work for a few months now,
but never had the chance to try it on a large project. After seeing the
new git import feature, I decided to give it a shot and .. whoa!

My work git repo is 2.8GB in size. During the first stage of the import,
the fossil file went to about 8.4GB, but after the vacuuming, it shrank
to the same 2.8GB. Sweet!

Some questions -

Q1: Rgd git import - my repo has git-svn branch references of the form
svnroot/BranchName. These have now been imported into fossil as
branches with name BranchName. Is the svnroot/ info preserved
somewhere in the repo?

Q2: fossil ui is nice. However, I'm missing search - like what git gui
gives. What do fossil-ites use for searching the repo? Just straight
database search?

Cheers,
-Kumar
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Zed A. Shaw
On Thu, Dec 02, 2010 at 05:06:03PM +0800, Kumar wrote:
> My work git repo is 2.8GB in size. During the first stage of the import,
> the fossil file went to about 8.4GB, but after the vacuuming, it shrank
> to the same 2.8GB. Sweet!

Ok, I gotta ask, what is *in* your repo that it's 2.8 GIGABYTES in size?
Is it really 2.8 thousand 1M git repos all screaming to get out? :-)


-- 
Zed A. Shaw
http://zedshaw.com/


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Kumar
:D I knew someone would ask that!

It's not all source (as if I had to say that). There are some media
files tracked as well (images, audio clips and video clips)
that are part of the shipping product, along with about 3 years' worth
of history for those.

-Kumar



Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Richard Hipp
On Thu, Dec 2, 2010 at 4:06 AM, Kumar srikuma...@gmail.com wrote:

> I've been using fossil for my personal work for a few months now,
> but never had the chance to try it on a large project. After seeing the
> new git import feature, I decided to give it a shot and .. whoa!
>
> My work git repo is 2.8GB in size. During the first stage of the import,
> the fossil file went to about 8.4GB, but after the vacuuming, it shrank
> to the same 2.8GB. Sweet!


FWIW, I'm thinking I should rename the import command to git-import - in
order to allow future expansion with other import schemes.  Similarly,
export should be renamed git-export.  Or, maybe there is a secondary
command to specify the target, for example:

   fossil import git
   fossil import svn
   fossil import hg

And so forth...  Thoughts?
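
To make the second proposal concrete, here is a minimal Python sketch of a command whose first argument names the source format. It is only an illustration of the command-line shape; the parser, the argument names and the "repository" argument are assumptions and have nothing to do with how Fossil actually parses its arguments.

```python
import argparse


def build_parser():
    # Hypothetical CLI layout for illustration; not Fossil's real parser.
    parser = argparse.ArgumentParser(prog="fossil")
    commands = parser.add_subparsers(dest="command", required=True)

    # One top-level "import" command; the format is a detail given after it.
    importer = commands.add_parser("import", help="import from another VCS")
    importer.add_argument("format", choices=["git", "svn", "hg"])
    importer.add_argument("repository", help="path of the new repository file")
    return parser


args = build_parser().parse_args(["import", "git", "new.fossil"])
print(args.command, args.format, args.repository)
```

Adding another format then means extending the list of choices rather than adding a new top-level command.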



> Some questions -
>
> Q1: Rgd git import - my repo has git-svn branch references of the form
> svnroot/BranchName. These have now been imported into fossil as
> branches with name BranchName. Is the svnroot/ info preserved
> somewhere in the repo?


I was not real clear what should be done with those tags, so I stripped off
all but the last component of the tag.  Is that not the right thing to do?
Please explain.



> Q2: fossil ui is nice. However, I'm missing search - like what git gui
> gives. What do fossil-ites use for searching the repo? Just straight
> database search?


SQLite has a great full-text search engine built in.  I've long thought that
it would be great to add an interface to this in Fossil.  We could index
diffs for all check-ins, all wiki, all tickets, all Blog entries, etc, and
then have a Google-like interface for searching for things.  Of course, the
full text index would likely double the size of the repository file, but in
this era of TB-size disk drives, is that really an issue?

It's really more of a matter of finding the time to do the necessary
hacking.
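
As a rough illustration of the kind of index described above, the following Python sketch builds a small SQLite full-text table and queries it. The table name search_idx, its columns and the sample rows are made up for the example; this is not Fossil's schema, and which FTS modules are available depends on how the local SQLite library was built.

```python
import sqlite3


def make_index(conn):
    # Try FTS5 first, then FTS4; availability depends on the SQLite build.
    for module in ("fts5", "fts4"):
        try:
            conn.execute(
                f"CREATE VIRTUAL TABLE search_idx USING {module}(kind, body)"
            )
            return module
        except sqlite3.OperationalError:
            continue
    raise RuntimeError("this SQLite build has no FTS5 or FTS4 module")


conn = sqlite3.connect(":memory:")
make_index(conn)

# Hypothetical documents standing in for check-in comments, wiki and tickets.
conn.executemany(
    "INSERT INTO search_idx(kind, body) VALUES (?, ?)",
    [
        ("checkin", "Speed up git import of large media files"),
        ("wiki", "How to clone a repository"),
        ("ticket", "Import fails on branch names containing a slash"),
    ],
)

# MATCH runs a full-text query against every indexed column.
hits = conn.execute(
    "SELECT kind, body FROM search_idx WHERE search_idx MATCH ? ORDER BY kind",
    ("import",),
).fetchall()
for kind, body in hits:
    print(kind, "|", body)
```

The index lives in its own table, so it could in principle be dropped and rebuilt without touching the indexed content.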




-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Joerg Sonnenberger
On Thu, Dec 02, 2010 at 09:15:55AM -0500, Richard Hipp wrote:
> FWIW, I'm thinking I should rename the import command to git-import - in
> order to allow future expansion with other import schemes.  Similarly,
> export should be renamed git-export.  Or, maybe there is a secondary
> command to specify the target, for example:

Strictly speaking, it is git-fast-import as that's the input stream it
is reading. Either that or git-import looks best to me.

Joerg


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Lluís Batlle i Rossell
On Thu, Dec 02, 2010 at 09:15:55AM -0500, Richard Hipp wrote:
 
>> Q2: fossil ui is nice. However, I'm missing search - like what git gui
>> gives. What do fossil-ites use for searching the repo? Just straight
>> database search?
>
> SQLite has a great full-text search engine built in.  I've long thought that
> it would be great to add an interface to this in Fossil.  We could index
> diffs for all check-ins, all wiki, all tickets, all Blog entries, etc, and
> then have a Google-like interface for searching for things.  Of course, the
> full text index would likely double the size of the repository file, but in
> this era of TB-size disk drives, is that really an issue?
>
> It's really more of a matter of finding the time to do the necessary
> hacking

I'm happy to hear that!

I'd leave that as an option, though: whether or not to double the size of
the repository file.

Would a search option without indexing be feasible?


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Gour
On Thu, 2 Dec 2010 15:33:26 +0100, Joerg Sonnenberger wrote:

> Strictly speaking, it is git-fast-import as that's the input
> stream it is reading. Either that or git-import looks best to
> me.

+1

Moreover, I do not think there is a need for e.g. git import svn
etc., considering that git's fast-import is the de facto standard for
exchanging changesets among different DVCSs.


Sincerely,
Gour

-- 

Gour  | Hlapicina, Croatia  | GPG key: CDBF17CA





Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Ramon Ribó
   Hello,

>    fossil import git
>    fossil import svn
>    fossil import hg
>
> And so forth...  Thoughts?

In my opinion, this is the most correct option for several reasons:

1- Do not pollute the global namespace

2- Make it easier for everyone who is not going to use the option
(most of the users) and needs to learn the fossil suboptions quickly

3- The important action is import; the detail is from where

About the full text search:

- Size is important. At least for some uses. It is common to copy
fossil repositories as files from one computer to another. At the same
time, one repository can have multiple copies.

One possibility could be:

1- The first time that the search is going to be used, require the
index to be created

2- Create it outside of the fossil repository, either in a directory
previously open or in the .fossil or something similar

3- Show all the results

  Regards,

RR




Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Kumar

>    fossil import git
>    fossil import svn
>    fossil import hg
>
> And so forth...  Thoughts?


If the format of the input stream can be auto-detected, then it can simply
be "fossil import", can't it? You'll only need to specify the format
explicitly for the export, I think.
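
A minimal Python sketch of what such auto-detection could look like, peeking at the first meaningful line of the stream. The function name and the "unknown" fallback are assumptions made for this example. It relies on two properties of the formats: a git fast-import stream starts with one of its commands (blob, commit, reset and so on), and a Subversion dump file starts with an "SVN-fs-dump-format-version:" header.

```python
# Commands that can open a git fast-import stream.
FAST_IMPORT_COMMANDS = {
    b"blob", b"commit", b"reset", b"tag", b"feature",
    b"option", b"progress", b"checkpoint", b"done",
}


def sniff_dump_format(first_bytes: bytes) -> str:
    """Guess the dump format from the first bytes of an input stream."""
    for raw in first_bytes.splitlines():
        line = raw.strip()
        if not line or line.startswith(b"#"):
            continue  # skip blank lines and fast-import comment lines
        if line.startswith(b"SVN-fs-dump-format-version:"):
            return "svn"
        keyword = line.split(b" ", 1)[0]
        if keyword in FAST_IMPORT_COMMANDS:
            return "git"
        return "unknown"  # first meaningful line matched nothing we know
    return "unknown"


print(sniff_dump_format(b"blob\nmark :1\ndata 5\nhello\n"))
print(sniff_dump_format(b"SVN-fs-dump-format-version: 2\n\nUUID: abc\n"))
```

A guess like this can be wrong, so an explicit format argument would still be useful as an override.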

>> Q1: Rgd git import - my repo has git-svn branch references of the form
>> svnroot/BranchName. These have now been imported into fossil as
>> branches with name BranchName. Is the svnroot/ info preserved
>> somewhere in the repo?
>
> I was not real clear what should be done with those tags, so I stripped off
> all but the last component of the tag.  Is that not the right thing to do?
> Please explain


Oh! In my case, the git branches named BranchName were tracking the
branches named svnroot/BranchName and so I didn't have a problem, but in
the general case I can imagine a collision if you just use the last term. As
long as both branches are pointing to the same commit, there is no collision
issue and you can drop one, but if they aren't then it'll help if the full
branch name is retained - like svnroot/BranchName itself. So, ok,
fossil->git is not an exact inverse of git->fossil.
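
The collision case can be sketched in a few lines of Python. The ref names and commit ids below are invented for the example, and the function is only an illustration of the check, not of what the importer does: two refs collide when they share the same last path component but point at different commits.

```python
from collections import defaultdict


def find_collisions(refs):
    """Return short names shared by refs that point at different commits."""
    by_short = defaultdict(dict)
    for full_name, commit in refs.items():
        short = full_name.rsplit("/", 1)[-1]  # keep only the last component
        by_short[short][full_name] = commit
    return {
        short: sorted(names)
        for short, names in by_short.items()
        if len(set(names.values())) > 1
    }


# Hypothetical refs: "trunk" tracks svnroot/trunk, "release" has diverged.
refs = {
    "svnroot/trunk": "a1",
    "trunk": "a1",
    "svnroot/release": "b2",
    "release": "c3",
}
print(find_collisions(refs))  # {'release': ['release', 'svnroot/release']}
```

Only the diverged pair is reported; the two trunk refs point at the same commit, so dropping one of them loses nothing.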

> SQLite has a great full-text search engine built in.  I've long thought that
> it would be great to add an interface to this in Fossil.  We could index
> diffs for all check-ins, all wiki, all tickets, all Blog entries, etc, and
> then have a Google-like interface for searching for things.  Of course, the
> full text index would likely double the size of the repository file, but in
> this era of TB-size disk drives, is that really an issue?


Such a search interface would be awesome! I don't think the size increase is
an issue at all. The only case may be backups, but even they can be done
using cloning - that would mean that the search index shouldn't sync.

Cheers,
-Kumar


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Stephan Beal
On Thu, Dec 2, 2010 at 4:36 PM, Ramon Ribó ram...@compassis.com wrote:

> About the full text search:
>
> - Size is important. At least for some uses. It is common to copy
> fossil repositories as files from one computer to another. At the same
> time, one repository can have multiple copies.
>
> One possibility could be:
> ...
> 2- Create it outside of the fossil repository, either in a directory
> previously open or in the .fossil or something similar


A related option: allow an FTS only on a checked-out copy of the tree, where we
already have extra files like _FOSSIL_. Build the FTS during 'fossil
open', update it during 'update' ops, and nuke it during 'fossil close'?

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Richard Hipp
On Thu, Dec 2, 2010 at 10:21 AM, Gour g...@atmarama.net wrote:

> On Thu, 2 Dec 2010 15:33:26 +0100, Joerg Sonnenberger wrote:
>
>> Strictly speaking, it is git-fast-import as that's the input
>> stream it is reading. Either that or git-import looks best to
>> me.
>
> +1
>
> Moreover, I do not think there is need for e.g. git import svn
> etc. considering that git's fast-import is de-facto standard for
> exchanging changesets among different DVCS-s.


There are limitations with fast-import.  For example, when Fossil exports to
the fast-import format, it has to omit all information about wiki and
tickets and blog entries, because there is no provision for that in the
format.  So I'm thinking that fast-import is not the final word in DVCS
interchange formats.  Fossil should be prepared for the next interchange
format, whatever that format might turn out to be.




-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Nolan Darilek
On 12/02/2010 09:36 AM, Ramon Ribó wrote:
 
> In my opinion, this is the most correct option for several reasons:
>
> 1- Do not pollute the global namespace
>
> 2- Make it easier for everyone that is not going to use the option
> (most of the users), and need to learn fast the fossil suboptions
>
> 3- The important actions is import, the detail is from where
 

I agree with this for all the reasons stated. If import/export will only
do git, then it'd make sense to rename them. If they may do other
formats in the future, then I think that a subcommand would be best.

One thing that I really like about Fossil is that fossil help fits on
a single screen and includes *all* commands (other than the test ones
that aren't really needed anyway), whereas git help uses two screens
and only includes the most common commands. And even then, git help
doesn't include some commands I regularly need. I realize that's mostly
cosmetic, and I won't spend much time painting this particular bikeshed,
but import/export aren't common operations, and having a bunch of
top-level commands just for that purpose would seem to crowd that
display even further and introduce command creep.


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread David Shinn
>> SQLite has a great full-text search engine built in.  I've long thought that
>> it would be great to add an interface to this in Fossil.  We could index
>> diffs for all check-ins, all wiki, all tickets, all Blog entries, etc, and
>> then have a Google-like interface for searching for things.  Of course, the
>> full text index would likely double the size of the repository file, but in
>> this era of TB-size disk drives, is that really an issue?
>
> Such a search interface would be awesome! I don't think the size increase is
> an issue at all. The only case may be backups, but even they can be done
> using cloning - that would mean that the search index shouldn't sync.

Because fossil stores artifacts as zlib-compressed deltas, full text searching
using the existing fossil SQLite database is not possible.  One would either
have to create a separate database containing the full set of artifacts or
create a program to reconstruct each artifact of the database and do a search
on each artifact separately.  Does anybody have a better idea?
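
The second alternative, reconstructing each artifact and searching it, can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the store is a plain dictionary with made-up ids, every blob is a complete zlib-compressed artifact, and the delta step is left out entirely. A real implementation would first have to apply the deltas to rebuild each artifact.

```python
import zlib


def grep_artifacts(store, needle):
    """Return the ids of artifacts whose uncompressed content contains needle."""
    hits = []
    for artifact_id, blob in store.items():
        # The compressed bytes cannot be searched directly, so each
        # artifact is expanded before it is scanned.
        if needle in zlib.decompress(blob):
            hits.append(artifact_id)
    return hits


# Hypothetical store: artifact id -> zlib-compressed content (no deltas).
store = {
    "a1": zlib.compress(b"int main(void) { return 0; }"),
    "b2": zlib.compress(b"TODO: add full text search"),
}
print(grep_artifacts(store, b"search"))  # ['b2']
```

The cost grows with the total uncompressed size of the repository, which is the trade-off against keeping a separate index.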

David Shinn


Re: [fossil-users] A large repo import succeeded!

2010-12-02 Thread Kumar
David Shinn wrote:

> Because fossil stores artifacts as zlib compressed deltas, full text searching
> using the existing fossil SQLite database is not possible.  One would either
> have to create a separate database containing the full set of artifacts or
> create a program to reconstruct each artifact of the database and do a search
> on each artifact separately.  Does anybody have a better idea?


Repo search is really two problems I think -

a) Searching the artifacts - i.e. content - with revision specified. This
can be equivalent in cost to checking out a version and running a file grep.

b) Searching the commit logs. Maybe only the commit logs can be stored
uncompressed (maybe indexed as well) in order to enable some searching.
That'll get us search by text/author/date/files-changed.

Thoughts?