Hey Alan
see inline
(cc'ing opengrok-dev so that more dev folks see this, in case they want
to chime in)
On 8.2.2011 18:31, ALAN KAPLAN, BLOOMBERG/ 731 LEXIN wrote:
> hi Lubos, I was wondering if you could answer these questions for me?
> 1. When you remove something from your source (ie. an entire directory), when
> i did the index it said it was removing a bunch of stale files, and it created
> the index. So should I assume that the index no longer contains this info, for
> the deleted directory? however, there is an entry in the historycache dir, and
> the index dir. can I safely just remove that? It seems this doesn't get cleaned
> up? please advise how to proceed here. thanks.
if it's a top-level folder (a "project") and the folder doesn't exist in
src, then feel free to remove it
if it's a subdirectory of a project and it also doesn't exist in src
anymore, you can do the same imho (or, for the time being, move it aside
to be safe ... if no problems arise within a week, then delete it)
afaik opengrok will just remove the files from the lucene index (hence you
will not be able to search for them), but all the cache (xrefs and
history) will probably stay behind - feel free to file a bug on this
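A rough sketch of the "move it aside" step for top-level projects, in
Python. It assumes the data root holds one subdirectory per project under
index, historycache and xref; that layout, the function names and the
paths are assumptions for illustration, so check them against your own
data root first. It only looks at top-level folders; stale subdirectories
inside a project would need a recursive comparison.

```python
import shutil
from pathlib import Path

# Assumed layout: DATA_ROOT/{index,historycache,xref}/<project>/...
CACHE_KINDS = ("index", "historycache", "xref")


def find_stale_entries(src_root, data_root, kinds=CACHE_KINDS):
    """Return cache directories whose project no longer exists under src_root."""
    src_root, data_root = Path(src_root), Path(data_root)
    stale = []
    for kind in kinds:
        base = data_root / kind
        if not base.is_dir():
            continue  # this kind of cache is not present at all
        for entry in sorted(base.iterdir()):
            if entry.is_dir() and not (src_root / entry.name).exists():
                stale.append(entry)
    return stale


def move_aside(entries, aside_root):
    """Move stale entries under aside_root, keeping the <kind>/<project> layout."""
    aside_root = Path(aside_root)
    moved = []
    for entry in entries:
        target = aside_root / entry.parent.name / entry.name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(entry), str(target))
        moved.append(target)
    return moved
```

Usage would be something like
move_aside(find_stale_entries(SRC, DATA), ASIDE) with your own three
paths; print the result of find_stale_entries first and only delete the
aside directory once a week or so has passed without problems.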
> 2. I see there is a new OpenGrok version .10, can I just install it and run
> instead of .9 without having to reindex my entire project? When I initially
> created the project the job ran for a week. Now it does incremental updates in
> about 1-2 hours/day.. Can I continue doing this with the new version? or do I
> have to redo it from scratch?
so ... you should probably follow kah's notes -
http://blogs.sun.com/kah/entry/opengrok_0_10
you can most probably use 0.10 without a reindex, BUT
navigate will not work for sure + you will not be able to leverage the
fixes done to the xref cache on existing files until they change
(obviously, xrefs for new files will be generated in the new xref format)
so ... indexing from scratch is seriously recommended
if you don't mind, I'd rather look into why opengrok runs for a week
than have you run it without the new features, hmm?
e.g. we have several opengrok instances, and some of them index A LOT of
sources with different SCMs
The biggest problem so far was with cvs history regeneration, the rest
was quite fast.
On one of our biggest servers (~20G of sources, using hg, svn,
teamware (sccs) and cvs) indexing from scratch takes roughly 2
days, and that's only because of the bsd historycache, which is generated
from remote bsd servers which are VERY SLOW.
What we do in such cases is temporarily disable the historycache for
those projects we know are slow to generate (e.g. bsd, openssl), then
we let opengrok do its stuff, and most of the source is ready. Then we
enable history again for the slower scms and reindex when there is spare
time - e.g. over the next weekend - this way the downtime is minimal.
By disabling history I mean e.g. moving the CVS or .svn directories
aside, so opengrok will not detect the scm and will not try to use it.
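A minimal sketch of that move-aside trick in Python, in case it helps.
The function names and the aside location are made up for illustration;
the only thing taken from the above is that CVS and .svn directories get
moved out of the source tree and put back later. The aside directory
must live outside the project tree, otherwise the indexer would just
pick the moved directories up as source.

```python
import os
import shutil
from pathlib import Path

SCM_DIRS = ("CVS", ".svn")  # metadata directories to hide from the indexer


def hide_scm_dirs(project_root, aside_root, names=SCM_DIRS):
    """Move SCM metadata dirs out of project_root, mirroring their relative paths."""
    project_root, aside_root = Path(project_root), Path(aside_root)
    moved = []
    for dirpath, dirnames, _ in os.walk(project_root):
        for name in [d for d in dirnames if d in names]:
            src = Path(dirpath) / name
            dst = aside_root / src.relative_to(project_root)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
            dirnames.remove(name)  # already moved, do not descend into it
            moved.append(dst)
    return moved


def restore_scm_dirs(project_root, aside_root, names=SCM_DIRS):
    """Move the metadata dirs back to where hide_scm_dirs took them from."""
    project_root, aside_root = Path(project_root), Path(aside_root)
    restored = []
    for dirpath, dirnames, _ in os.walk(aside_root):
        for name in [d for d in dirnames if d in names]:
            src = Path(dirpath) / name
            dst = project_root / src.relative_to(aside_root)
            shutil.move(str(src), str(dst))
            dirnames.remove(name)
            restored.append(dst)
    return restored
```

Keep in mind that while the metadata is moved aside, cvs/svn commands
will not work in that checkout, so restore before the next update of
the sources.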
Another low-downtime option (~5 mins) is to keep the opengrok
metadata on a zfs dataset.
You can run the 0.10 indexer with a different target dir (which is
another dataset) - the OPENGROK_INSTANCE_BASE variable can be used for that.
Once indexing is done, you just stop tomcat/glassfish, zfs rename the
old dataset to some backup name and the new one to the default name,
then copy over the war (or do OpenGrok deploy, possibly with
OPENGROK_TOMCAT_BASE) and start the container anew (~5 min)
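The dataset swap itself could be sketched like this in Python. The
dataset names in the usage note are hypothetical, and the sketch assumes
both datasets are in the same pool and use inherited mountpoints, so that
the mounted path follows the dataset name. Stopping and starting the
container and deploying the war are left out, since those commands depend
on your setup.

```python
import subprocess


def swap_commands(live, fresh, backup):
    """Build the zfs commands that retire `live` to `backup` and promote `fresh`."""
    return [
        ["zfs", "rename", live, backup],  # old data becomes the backup
        ["zfs", "rename", fresh, live],   # newly indexed data takes its place
    ]


def run_swap(live, fresh, backup, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on the first error."""
    cmds = swap_commands(live, fresh, backup)
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return cmds
```

For example, run_swap("tank/opengrok", "tank/opengrok-new",
"tank/opengrok-old") prints the two zfs rename commands without touching
anything; pass dry_run=False only after the container is stopped.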
let me know if you want to pursue the long indexing problem, we can
possibly improve this time ...
fingers crossed
Lubos
> Any assistance is greatly appreciated. thanks. --Alan