There are still very SIGNIFICANT problems with his tests.

1. The environment is not "real", except possibly for desktop searching. Whether the JVM needs 64m or 128m to perform adequately is immaterial, given the price of RAM and the ease of expansion. It would be akin to saying "let's all try to write programs that work with 64k of memory". Why needlessly constrain yourself? I would take performance, readability, maintainability, and reliability over memory consumption any day.

If the point of the test is to show that a Java-based searching system needs more memory than a scripting language on top of a C db library, who cares? The smallest possible Java program I can write shows a VM size of 8.5mb under Windows XP - which is larger than the whole of Odeum/Ruby. There is a lot that the Java system provides that isn't useful in this particular use case, but I run Lucene in a multithreaded server environment, with a test index of 350mb, and I can run it in as little as 19mb - but why would I want to?
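For anyone who wants to look at heap figures like these on their own machine, here is a minimal sketch. It is not taken from Zed's test code, and the class name HeapReport is made up for illustration. It only prints what the JVM reports about its own heap through java.lang.Runtime, which is not the same number as the process VM size that the OS shows.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only: prints heap figures as seen from inside the JVM.
public class HeapReport {

    /** Returns {max, committed, used} heap sizes in bytes. */
    public static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();           // most heap the JVM will try to use (roughly -Xmx)
        long committed = rt.totalMemory();   // heap currently reserved by the JVM
        long used = committed - rt.freeMemory();
        return new long[] { max, committed, used };
    }

    /** Formats a byte count as megabytes with one decimal place. */
    public static String format(long bytes) {
        return String.format(Locale.ROOT, "%.1f MB", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("max heap:       " + format(s[0]));
        System.out.println("committed heap: " + format(s[1]));
        System.out.println("used heap:      " + format(s[2]));
    }
}
```

Running it as `java -Xmx64m HeapReport` and again with `-Xmx128m` shows how the reported maximum follows the setting. The used figure depends on when the garbage collector last ran, so treat a single reading as approximate.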
2. The search is always for the same word. The Odeum database-based version will almost certainly cache all the required data and index blocks in memory after the first run, avoiding all calls to the OS. Since Lucene performs no local caching (without my mods), it will ALWAYS require trips to the OS. Also, each run of Lucene is without question going to generate garbage. A properly designed non-Java db will almost certainly generate no increased memory usage in the constrained case.

Running the tests using multiple threads on random words would be far more interesting.

Lastly, for what it's worth (and that's probably not much!) - if Odeum was the "better search engine", you could do a Java -> Odeum mapping and I GUARANTEE the Java implementation using the latest JVMs' JIT compilers will be faster than the Ruby one. Also, where's my cross-platform GUI for displaying the search results?

The best developers attempt to use the right tool for the job. Ruby is a GREAT scripting language, and is perfect for all the things scripting languages are good for. Let's leave it at that.

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]]
Sent: Thursday, June 02, 2005 5:09 AM
To: java-dev@lucene.apache.org
Subject: Re: Lucene vs. Ruby/Odeum

Zed has updated his second part with more experiments with different JVMs and memory settings:

http://www.zedshaw.com/projects/ruby_odeum/odeum_lucene_part2.html

On Jun 2, 2005, at 12:27 AM, Robert Engels wrote:
> I read all of Zed's posts on the subject and I feel he certainly
> presents a strong anti-Java

Most definitely an anti-Java leaning - but at least he's working on being objective about it by measuring things :)

> , if not anti-Lucene bias - maybe just pro Ruby.

He's quite pro-Lucene, and most definitely pro-Ruby. I count myself in both categories.

> If you do not even adhere to the principal designer's "guidelines
> to proper usage", your tests are meaningless. It's akin to using a
> new flat screen monitor and claiming "boy, it has a fuzzy picture",
> because you didn't follow the instructions that said "remove
> protective film before using".

I concur with your sentiment and I've done what I can via e-mail with him to educate him on my experience with Lucene and JVM garbage collection. I'd encourage anyone who has the time and inclination to take him up on the request to show how to do it better since he's made his code available.

> Zed is using a very constrained test - which is probably very
> UNCOMMON in the real world of server based systems, to attempt to
> discern the relative performance characteristics of
> Lucene/Java/Ruby/etc. The tests may be applicable in his poorly
> designed environment, but he presents his limited finding as
> "gospel", and that it should hold true in all cases. I quote...
>
> "For the people who have no clue (also known as "Executives")
> here's the information you need to tell all your employees they
> need to adopt the latest and greatest thing without ever having to
> understand anything you read. Cheaper than an article in CIO
> magazine and even has big words like "standard deviation"."
>
> and then goes on to present his "statistically correct" performance
> numbers.

Don't get me wrong - Zed is using inflammatory language. We should work to not lower ourselves to speaking in that same tone but rather objectively and nicely point out the errors of his ways. He's open to that despite his caustic tone - at least from the e-mail exchanges I've had with him.

    Erik

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]