Re: [ANNOUNCE] lucene4c 0.02

2005-02-14 Thread Erik Hatcher
On Feb 13, 2005, at 11:37 PM, Otis Gospodnetic wrote: I was going to wait a bit with inviting various Lucene ports to Lucene until we have the mailing lists set up and at lucene.apache.org, to make things a bit more tangible for people. Since these codebase imports have to come through incubation,

lucene.apache.org

2005-02-14 Thread Erik Hatcher
We now have lucene.apache.org mapped, but we don't have a site there yet. Doug - do you have your Forrest work handy? Or has anyone else stepped up to build the web site? I'll try later today to get our current site up at the new domain to have a starting point. Erik Begin

Re: lucene.apache.org

2005-02-14 Thread Erik Hatcher
On Feb 14, 2005, at 8:11 AM, Kelvin Tan wrote: I'm happy to help, but I haven't been keeping track of this thread.. What needs to be done, and how can I help? The idea is to Forrest-ize the Lucene site, rather than using jakarta-site2 and anakia. Doug said he's worked with Forrest some and

Re: [ANNOUNCE] lucene4c 0.02

2005-02-14 Thread Garrett Rooney
Erik Hatcher wrote: I encourage you to propose the codebase to the Incubator. For the curious, the proposal has been sent. I hope to see you all voicing your support on general@incubator.apache.org ;-) -garrett

Re: lucene.apache.org

2005-02-14 Thread Erik Hatcher
I have checked out our current site to the lucene.apache.org area, and I've also set up a redirect from the jakarta.apache.org/lucene area. Things are redirecting fine for me. Let me know if you encounter any issues, but also be patient in case the DNS updates for lucene.apache.org have

Re: lucene.apache.org

2005-02-14 Thread Garrett Rooney
Erik Hatcher wrote: I have checked out our current site to the lucene.apache.org area, and I've also set up a redirect from the jakarta.apache.org/lucene area. Things are redirecting fine for me. Let me know if you encounter any issues, but also be patient in case the DNS updates for

Re: lucene.apache.org

2005-02-14 Thread Bernhard Messer
Erik Hatcher schrieb: I have checked out our current site to the lucene.apache.org area, and I've also set up a redirect from the jakarta.apache.org/lucene area. Things are redirecting fine for me. Let me know if you encounter any issues, but also be patient in case the DNS updates for

BitSet implementation and large index

2005-02-14 Thread tony
It seems that for a huge index, it might be a good idea to use a different implementation of the BitSet when doing filtering (assuming the non-filtered set is relatively small). This would really help minimize the memory required for each filter operation. Since the default implementation of
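As a rough illustration of the idea in this message, here is a minimal sketch of a sparse alternative to `java.util.BitSet`. It is an assumption made for illustration, not the implementation proposed in the thread: the class name `SparseDocIdSet` and its methods are hypothetical, and it simply stores the set document numbers in a sorted `int` array so that memory grows with the number of matches rather than with the size of the index.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sparse set of document numbers; not part of Lucene. */
public class SparseDocIdSet {
    // Set bits stored in ascending order, 4 bytes per matching document.
    private final int[] sortedDocIds;

    /** Copies the set bits of a dense BitSet into a sorted array. */
    public SparseDocIdSet(BitSet bits) {
        sortedDocIds = new int[bits.cardinality()];
        int i = 0;
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            sortedDocIds[i++] = doc;
        }
    }

    /** Membership test by binary search: O(log n) instead of O(1). */
    public boolean get(int docId) {
        return Arrays.binarySearch(sortedDocIds, docId) >= 0;
    }

    public int cardinality() {
        return sortedDocIds.length;
    }

    /** Approximate payload size of the sparse form, ignoring object overhead. */
    public long approxBytes() {
        return 4L * sortedDocIds.length;
    }

    /** Approximate payload size of a dense bit set covering maxDoc documents. */
    public static long denseBytes(int maxDoc) {
        return (maxDoc + 7L) / 8L;
    }
}
```

Under these assumptions the sparse form is smaller only while fewer than roughly one document in 32 matches; beyond that density a plain `BitSet` uses less memory.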

Re: lucene.apache.org

2005-02-14 Thread Doug Cutting
Erik Hatcher wrote: Doug - do you have your Forrest work handy? Or has anyone else stepped up to build the web site? I don't have anything reusable. I converted Nutch from a different (not Anakia) XML-based site to Forrest with little difficulty (mostly using string replace in Emacs). I

Re: lucene.apache.org

2005-02-14 Thread Doug Cutting
Erik Hatcher wrote: I have checked out our current site to the lucene.apache.org area, and I've also set up a redirect from the jakarta.apache.org/lucene area. Keep in mind, there are two projects here: 1. Porting Java Lucene's site to Forrest. This should be structured as a sub-project of

Re: BitSet implementation and large index

2005-02-14 Thread jian chen
Hi, In database system implementations, there is a type of index called a bitmap index. The bitset implementation could borrow ideas from database engine implementations. You could squeeze all the 0's together and record how many there are, which might save a lot of memory. There are
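The suggestion to squeeze runs of 0's together can be sketched as a simple run-length encoding. This is a minimal sketch under stated assumptions, not code from the thread or from Lucene: the class name `ZeroRunCodec` is hypothetical, and each set bit is represented by the count of 0's that precede it since the previous set bit.

```java
import java.util.BitSet;

/** Hypothetical run-length codec for sparse bit sets; not part of Lucene. */
public class ZeroRunCodec {

    /** Returns, for each set bit in order, the number of 0 bits before it. */
    public static int[] encode(BitSet bits) {
        int[] runs = new int[bits.cardinality()];
        int prev = -1; // position of the previous set bit
        int i = 0;
        for (int b = bits.nextSetBit(0); b >= 0; b = bits.nextSetBit(b + 1)) {
            runs[i++] = b - prev - 1; // zeros strictly between prev and b
            prev = b;
        }
        return runs;
    }

    /** Rebuilds the bit set by skipping each run of zeros and setting one bit. */
    public static BitSet decode(int[] runs) {
        BitSet bits = new BitSet();
        int pos = -1;
        for (int run : runs) {
            pos += run + 1;
            bits.set(pos);
        }
        return bits;
    }
}
```

For example, a bit set with bits 2, 3 and 10 set encodes to the runs 2, 0 and 6. Stored as plain `int` values this costs the same as a sorted array of positions; the saving suggested in the message would come from writing the small run lengths with a variable-length byte encoding, which this sketch leaves out.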

Re: [ANNOUNCE] lucene4c 0.02

2005-02-14 Thread Doug Cutting
Garrett Rooney wrote: Additionally it would be good to work on updating the disk format documentation, I've found several cases where the docs are quite out of date compared to the current code. It's hard to expect the various different ports to maintain compatibility when the formats are only

Re: lucene.apache.org

2005-02-14 Thread Doug Cutting
Garrett Rooney wrote: Agreed. Java Lucene is a subproject of the Lucene TLP, leaving the existing Java Lucene site there for the time being seems ok, just so we have something there, but we should endeavour to put up something more permanent ASAP. I think, for the present,

Re: [ANNOUNCE] lucene4c 0.02

2005-02-14 Thread Garrett Rooney
Doug Cutting wrote: Garrett Rooney wrote: Additionally it would be good to work on updating the disk format documentation, I've found several cases where the docs are quite out of date compared to the current code. It's hard to expect the various different ports to maintain compatibility when

Re: lucene.apache.org

2005-02-14 Thread Garrett Rooney
Doug Cutting wrote: Garrett Rooney wrote: Agreed. Java Lucene is a subproject of the Lucene TLP, leaving the existing Java Lucene site there for the time being seems ok, just so we have something there, but we should endeavour to put up something more permanent ASAP. I think, for the present,

Transactional Directories

2005-02-14 Thread Oscar Picasso
Hi, I am currently implementing a Directory backed by a Berkeley DB that I am willing to release as an open source project. Besides the internal implementation, it differs from the one in the sandbox in that it is implemented with the Berkeley DB Java Edition. Using the Java Edition allows an

Re: What does [] do to a query and what's up with lucene.apache.org?

2005-02-14 Thread Erik Hatcher
On Feb 14, 2005, at 12:24 PM, Doug Cutting wrote: Otis Gospodnetic wrote: lucene.apache.org seems to work now. Here is the query syntax: http://lucene.apache.org/queryparsersyntax.html We should be cautious in promoting lucene.apache.org urls until we have this structured correctly. Let's

Re: What does [] do to a query and what's up with lucene.apache.org?

2005-02-14 Thread Doug Cutting
Erik Hatcher wrote: I'm really at the limit of my bandwidth - I've got the sandbox restructuring effort on my plate right now and would like it if someone could pick up the ball on the web site side of things. Then perhaps you shouldn't have redirected everything to lucene.apache.org... We

Re: lucene.apache.org

2005-02-14 Thread Erik Hatcher
On Feb 14, 2005, at 12:39 PM, Garrett Rooney wrote: Doug Cutting wrote: Garrett Rooney wrote: Agreed. Java Lucene is a subproject of the Lucene TLP, leaving the existing Java Lucene site there for the time being seems ok, just so we have something there, but we should endeavour to put up

Re: lucene.apache.org

2005-02-14 Thread Doug Cutting
Erik Hatcher wrote: It also might be a good time to think about mailing list names. There was a request on infrastructure@ to move [EMAIL PROTECTED] to [EMAIL PROTECTED], would it make more sense to move it to [EMAIL PROTECTED] NOW you tell me :) I think until we have these elusive other

Re: lucene.apache.org

2005-02-14 Thread Bernhard Messer
Doug Cutting schrieb: Erik Hatcher wrote: It also might be a good time to think about mailing list names. There was a request on infrastructure@ to move [EMAIL PROTECTED] to [EMAIL PROTECTED], would it make more sense to move it to [EMAIL PROTECTED] NOW you tell me :) I think until we

Re: lucene.apache.org

2005-02-14 Thread Doug Cutting
Doug Cutting wrote: And we also want to try not to break URLs when we move things. For this reason it's best to move things as few times as possible, so that we don't end up with a confusing set of redirects. More to the point, we also want to try not to break email addresses. So the fewer

Re: lucene.apache.org

2005-02-14 Thread Erik Hatcher
On Feb 14, 2005, at 12:36 PM, Doug Cutting wrote: Garrett Rooney wrote: Agreed. Java Lucene is a subproject of the Lucene TLP, leaving the existing Java Lucene site there for the time being seems ok, just so we have something there, but we should endeavour to put up something more permanent

Re: BitSet implementation and large index

2005-02-14 Thread Paul Elschot
On Monday 14 February 2005 18:31, jian chen wrote: Hi, In database system implementations, there is a type of index called a bitmap index. The bitset implementation could borrow ideas from database engine implementations. You could squeeze all the 0's together and write how many of

Re: lucene.apache.org

2005-02-14 Thread Doug Cutting
Bernhard Messer wrote: Doug, you placed a copy of the website in the java directory. In both, the original and the java directory the api directory is missing. I can't copy it into because of the access rights :-( Argh. The group protection is 'lucene', as it should be, but you're not in

Re: lucene.apache.org

2005-02-14 Thread Bernhard Messer
Doug Cutting schrieb: Bernhard Messer wrote: Doug, you placed a copy of the website in the java directory. In both, the original and the java directory the api directory is missing. I can't copy it into because of the access rights :-( Argh. The group protection is 'lucene', as it should be,

Re: BitSet implementation and large index

2005-02-14 Thread Daniel Naber
On Monday 14 February 2005 16:29, [EMAIL PROTECTED] wrote: It seems that for a huge index, it might be a good idea to use a different implementation of the BitSet when doing filtering (assuming the non-filtered set is relatively small). This would really help minimize the memory required for

Re: lucene.apache.org

2005-02-14 Thread Erik Hatcher
I have updated the redirects a bit more and the old /docs links now redirect to the corresponding spot on lucene.apache.org/java/docs. Let me know if there are any old links not redirecting appropriately. I changed things down one level from where Doug checked them out. So it's now

Re: What does [] do to a query and what's up with lucene.apache.org?

2005-02-14 Thread Erik Hatcher
On Feb 14, 2005, at 1:53 PM, Doug Cutting wrote: Erik Hatcher wrote: I'm really at the limit of my bandwidth - I've got the sandbox restructuring effort on my plate right now and would like it if someone could pick up the ball on the web site side of things. Then perhaps you shouldn't have

Re: lucene.apache.org

2005-02-14 Thread Erik Hatcher
On Feb 14, 2005, at 2:21 PM, Doug Cutting wrote: Doug Cutting wrote: And we also want to try not to break URLs when we move things. For this reason it's best to move things as few times as possible, so that we don't end up with a confusing set of redirects. More to the point, we also want to try

Re: lucene.apache.org

2005-02-14 Thread Erik Hatcher
On Feb 14, 2005, at 2:33 PM, Doug Cutting wrote: Bernhard Messer wrote: Doug, you placed a copy of the website in the java directory. In both, the original and the java directory the api directory is missing. I can't copy it into because of the access rights :-( Argh. The group protection is

Changing INDEX_INTERVAL to allow smaller memory footprint?

2005-02-14 Thread Kevin A. Burton
I started a thread about a week ago about the memory footprint of opening up a lucene index. Our 30G index takes about 980M of memory to open. Otis and some others suggested changing TermInfosWriter.INDEX_INTERVAL (or specifically a variable that is 128 by default). I grepped through the
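To see why the interval matters, here is a back-of-the-envelope sketch. It assumes that roughly every `indexInterval`-th term is held in memory when the index is opened; the class name `TermIndexEstimate` and the term counts used below are hypothetical and are not figures from the thread.

```java
/** Hypothetical estimate of in-memory term index entries; not part of Lucene. */
public class TermIndexEstimate {

    /** Number of terms kept in memory if every indexInterval-th term is loaded. */
    public static long indexedTerms(long totalTerms, int indexInterval) {
        // Ceiling division without floating point.
        return (totalTerms + indexInterval - 1) / indexInterval;
    }
}
```

Under this assumption, an index with 1000 terms would keep 8 entries in memory at the default interval of 128 and 1 entry at an interval of 1024, so the memory used by the term index shrinks roughly in proportion as the interval grows, at the cost of a longer scan per term lookup.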

Re: What does [] do to a query and what's up with lucene.apache.org?

2005-02-14 Thread Erik Hatcher
On Feb 14, 2005, at 3:10 PM, Bernhard Messer wrote: U, the whole damned thing at http://lucene.apache.org is not responding any longer I've seen that too; it worked fine for a while, and then no longer. I thought it might be a temporary DNS thing. Erik

Re: Changing INDEX_INTERVAL to allow smaller memory footprint?

2005-02-14 Thread Kevin A. Burton
Kevin A. Burton wrote: I started a thread about a week ago about the memory footprint of opening up a lucene index. Ug... you know I'm sorry. Doug responded to this but I didn't see his followup. I'll try this change this week and see what happens. You can increase

Re: lucene.apache.org

2005-02-14 Thread Erik Hatcher
On Feb 14, 2005, at 4:12 PM, Bernhard Messer wrote: It seems that everything is fine now with the website :-) I noticed the /java/docs/api area is fine too now that the DNS seems to be working again. Just let me know when you're ok with me turning the redirect back on from jakarta.

Re: lucene.apache.org

2005-02-14 Thread Doug Cutting
Erik Hatcher wrote: I've amended my request for e-mail lists here with Doug's preference: http://issues.apache.org/jira/browse/INFRA-195 Do others agree this is the best approach? I don't mean to be autocratic. Do we imagine different pools of users and developers for different Lucene

Re: lucene.apache.org

2005-02-14 Thread Garrett Rooney
Doug Cutting wrote: Erik Hatcher wrote: I've amended my request for e-mail lists here with Doug's preference: http://issues.apache.org/jira/browse/INFRA-195 Do others agree this is the best approach? I don't mean to be autocratic. Do we imagine different pools of users and developers for

Re: Transactional Directories

2005-02-14 Thread Doug Cutting
Oscar Picasso wrote: Hi, I am currently implementing a Directory backed by a Berkeley DB that I am willing to release as an open source project. Besides the internal implementation, it differs from the one in the sandbox in that it is implemented with the Berkeley DB Java Edition. Using the Java

Re: Transactional Directories

2005-02-14 Thread Doug Cutting
[ Please ignore my previous message. I somehow hit Send before typing anything! ] Oscar Picasso wrote: However with a relatively high number of random insertions, the cost of the new IndexWriter / index.close() performed for each insertion is too high. Did you measure that? How much slower was

Re: What does [] do to a query and what's up with lucene.apache.org?

2005-02-14 Thread John Haxby
Erik Hatcher wrote: I've seen that too; it worked fine for a while, and then no longer. I thought it might be a temporary DNS thing. I think it is. The DNS does take a little while to settle down when a new name appears: it's only just appeared in my ISP. When I was managing my (old)

Re: lucene.apache.org

2005-02-14 Thread Otis Gospodnetic
I'm with Garrett. I think we do need a top level dev@ list for discussion of cross-port issues, like index format and compatibility, etc. We also need per-port and per-app lists. Otis --- Garrett Rooney [EMAIL PROTECTED] wrote: Doug Cutting wrote: Erik Hatcher wrote: I've amended my

Re: lucene.apache.org

2005-02-14 Thread Paul Smith
Hey all, I can only suggest what we do at Apache Logging Services that seems to work well for us. Have separate user and dev lists for each sub-product (so a user and dev list for Lucene for Java, or whatever), have a [EMAIL PROTECTED] mailing list for discussions that are relevant to all

Re: lucene.apache.org

2005-02-14 Thread Henri Yandell
I'll happily change the Jakarta site whenever you're ready. On names, Lucene Java might hit trademark issues I guess. So potential worry there. Hen