Hi,

I'm looking for some advice on whether Jackrabbit might be a good choice for my
problem. Any comments on this are greatly appreciated.


= Short description of the challenge =

We've built an Eclipse-based tool that analyzes Java source files and stores its
analysis results in additional files. The workspace potentially has hundreds
of projects, and each project may have up to a few thousand files. Say there
are 200 projects and 1,000 Java source files per project in a single
workspace; that makes 200 * 1,000 = 200,000 files.

On a full workspace build, all 200,000 files have to be compiled (by the
IDE) and analyzed (by our tool) at once, and the analysis results have to be
written to disk rather quickly.
The most common use case, however, is that a single file is changed several
times per minute and is therefore re-analyzed frequently.

At the moment, the analysis results are dumped to disk as plain JSON files, one
JSON file per Java class. Each JSON file is around 5 to 100 KB in size; some
files grow to several megabytes (< 10 MB) and contain a few hundred complex
JSON nodes (which might map perfectly to nodes in JCR).
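To give you a better idea of the data, here is a minimal sketch of how I
imagine mapping one per-class analysis result to JCR nodes via the plain JCR
API. All names below (projectA, com.example.Foo, "finding", etc.) are
placeholders made up for this mail, and Jackrabbit's TransientRepository is
used only for illustration:

import javax.jcr.Node;
import javax.jcr.Repository;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;

import org.apache.jackrabbit.core.TransientRepository;

public class AnalysisResultStore {

    public static void main(String[] args) throws Exception {
        Repository repository = new TransientRepository();
        Session session = repository.login(
                new SimpleCredentials("admin", "admin".toCharArray()));
        try {
            // one node per analyzed class, e.g. /results/projectA/com.example.Foo
            Node results = getOrAdd(session.getRootNode(), "results");
            Node project = getOrAdd(results, "projectA");
            Node clazz = getOrAdd(project, "com.example.Foo");

            // each complex JSON node of the result becomes a child node
            Node finding = clazz.addNode("finding-0001", "nt:unstructured");
            finding.setProperty("kind", "unusedMethod");
            finding.setProperty("line", 42L);

            session.save();
        } finally {
            session.logout();
        }
    }

    private static Node getOrAdd(Node parent, String name) throws Exception {
        return parent.hasNode(name)
                ? parent.getNode(name)
                : parent.addNode(name, "nt:unstructured");
    }
}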

= Question =

We would like to replace the simple file-system approach with something more
sophisticated, and I wonder whether Jackrabbit may be a suitable backend for
this use case. Since we already map all our data to JSON, Jackrabbit/JCR looks
like a perfect fit, but I can't say for sure.

What's your suggestion? Is Jackrabbit capable of quickly loading and storing
JSON-like data, even if 200,000 files (nodes plus their sub-nodes) have to be
updated in a very short time?
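
To make the write pattern concrete, a full build would roughly look like the
following under the mapping sketched above. The batch size and the
AnalysisResult type are made up, and whether batched session.save() calls are
the right way to do this in Jackrabbit is exactly part of my question:

import javax.jcr.Node;
import javax.jcr.RepositoryException;
import javax.jcr.Session;

public class FullBuildWriter {

    private static final int BATCH_SIZE = 1000; // arbitrary; would need tuning

    /** Writes the results of a full workspace build (~200,000 per-class results). */
    public static void writeAll(Session session, Iterable<AnalysisResult> results)
            throws RepositoryException {
        Node resultsRoot = session.getRootNode().getNode("results");
        int pending = 0;
        for (AnalysisResult result : results) {
            writeResult(resultsRoot, result);
            if (++pending >= BATCH_SIZE) {
                session.save(); // persist the current batch
                pending = 0;
            }
        }
        session.save(); // flush the remainder
        // the incremental case would be a single writeResult() + session.save()
    }

    private static void writeResult(Node resultsRoot, AnalysisResult result)
            throws RepositoryException {
        // replace the old result node and recreate it from the new JSON content
        String relPath = result.getProjectName() + "/" + result.getClassName();
        if (resultsRoot.hasNode(relPath)) {
            resultsRoot.getNode(relPath).remove();
        }
        // ... add nodes and properties for the result here, as in the sketch above
    }

    /** Placeholder for our analysis result type. */
    public interface AnalysisResult {
        String getProjectName();
        String getClassName();
    }
}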
 

Thanks for your suggestions. If you need more details on which operations are
performed or what the data looks like, I would be glad to answer your questions.

Marcel

-- 
Eclipse Code Recommenders:
 w www.eclipse.org/recommenders
 tw www.twitter.com/marcelbruch
 g+ www.gplus.to/marcelbruch
