Re: lucene and heritrix

2005-05-18 Thread Otis Gospodnetic
You can use Nutch - nutch.org will take you to its home. Otis --- Edward Quick <[EMAIL PROTECTED]> wrote: > Hi, > > Is it possible to use Lucene with Heritrix in order to create a web > search > engine? I read on the Lucene FAQ that opensource crawlers such as > Heritrix > are available, and g

Re: lucene / clucene

2005-05-25 Thread Otis Gospodnetic
lucene4c is a baby still. You should ask about CLucene on the CLucene list. Somebody just asked about that... maybe it was you :) CLucene folks still seem to want to join Lucene at Apache. Otis --- John Paige <[EMAIL PROTECTED]> wrote: > I have looked at Lucene4C, it looks to me that it is in

Re: indexing FTP or HTTP or Database

2005-07-19 Thread Otis Gospodnetic
For indexing FTP and HTTP servers, see Nutch (sub-project of Lucene). For indexing a DB you can write some custom JDBC to pull your data from DB and index it with Lucene. I imagine a few other people will email suggestions ;) Otis --- Bassem Elsayed <[EMAIL PROTECTED]> wrote: > How can I

Re: OR'ed boolean queries

2005-07-21 Thread Otis Gospodnetic
The problem is that you name a lot of NAMEFILEs that start with "ef". "A lot" means "more than 1024": http://lucene.apache.org/java/docs/api/org/apache/lucene/search/BooleanQuery.html#getMaxClauseCount() You could change it with this: http://lucene.apache.org/java/docs/api/org/apache/lucene/searc
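A rough sketch of why the limit gets hit, assuming a prefix query is rewritten into one OR clause per matching indexed term. `PrefixExpansionSketch` and `expand` are hypothetical names made up for illustration, not Lucene API; only the 1024 default comes from the message above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of prefix expansion; not the Lucene implementation.
class PrefixExpansionSketch {
    // Default limit quoted in the message above.
    static final int DEFAULT_MAX_CLAUSES = 1024;

    // Collects every indexed term that starts with the prefix and fails
    // as soon as one more clause would exceed maxClauses.
    static List<String> expand(List<String> indexedTerms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                if (clauses.size() >= maxClauses) {
                    throw new IllegalStateException("too many clauses: limit is " + maxClauses);
                }
                clauses.add(term);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            terms.add("ef" + i); // 2000 made-up file names starting with "ef"
        }
        try {
            expand(terms, "ef", DEFAULT_MAX_CLAUSES);
        } catch (IllegalStateException e) {
            System.out.println("expansion failed: " + e.getMessage());
        }
        // With a higher limit the same expansion goes through.
        System.out.println(expand(terms, "ef", 4096).size() + " clauses");
    }
}
```

In Lucene itself the limit is raised through the static `BooleanQuery.setMaxClauseCount(int)` setter; a narrower prefix is usually the cheaper fix.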

Re: Lucene as xml store

2005-07-21 Thread Otis Gospodnetic
Hi Namrata, Yes, you would need to parse the XML. Here is one way to do it: http://www-128.ibm.com/developerworks/java/library/j-lucene/ Otis --- Namrata Kumari <[EMAIL PROTECTED]> wrote: > > hi, > > I am a beginner to lucene , So kindly excuse me if the questions > mentioned a > bit na

RE: search alogorithm in Lucene

2005-08-08 Thread Otis Gospodnetic
If you need to index XML with Lucene, you can look at my article about using Digester+Lucene to parse+index XML documents. The article can be found on the IBM developerWorks site. You can also look at the code that comes with Lucene in Action where we show how to parse with Digester and SAX 2.0 AP

Re: IndexWriter and IndexReader open at the same time

2005-08-08 Thread Otis Gospodnetic
If you have the Lucene book, look at section 2.9 (Concurrency, thread-safety, and locking issues) on page 59 of Chapter 2 (Indexing): http://www.lucenebook.com/search?query=concurrency+rules Also, look at Lucene's Bugzilla, where you'll find a contribution that helps with concurr

Re: lucene webdemo on a webhosting hosting account

2005-08-30 Thread Otis Gospodnetic
Gaston, You need to have just that 1 Lucene Jar file, and make sure that that Jar file is in your CLASSPATH. If you need help understanding how CLASSPATH works, please see http://www.google.com/search?q=classpath%20tutorial . Otis --- Gasi <[EMAIL PROTECTED]> wrote: > > > > > Hello, > > f

Re: lucene webdemo on a webhosting hosting account

2005-08-31 Thread Otis Gospodnetic
error that this directives cannot be found on my > >> webspace: > >> "org.apache.lucene.analysis.*, org.apache.lucene.document.*, > >> org.apache.lucene.index.*, org.apache.lucene.search.*, > >> org.apache.lucene.queryParser.*, org.apache.lucene.demo.*, >

Re: lucene webdemo on a webhosting hosting account

2005-09-01 Thread Otis Gospodnetic
ts where results have to > exits, > because the index is proper uploaded. I think that it is a very small > > mistake but I cannot find it. Greetings, Gaston > > > - Original Message - > From: "Otis Gospodnetic" <[EMAIL PROTECTED]> > To: > Sent

Re: VSM in Lucene, again

2005-09-04 Thread Otis Gospodnetic
Hi Fredrik, Are you looking for org.apache.lucene.search.DefaultSimilarity ? Otis --- Fredrik Andersson <[EMAIL PROTECTED]> wrote: > Hi folks. > > I read a transcript from last months digest of this list, in a post > by > Rajesh Munavalli, that Lucene uses a VSM retrieval method. In my > prev

Re: Exception when crawl trys to finish...

2005-09-07 Thread Otis Gospodnetic
You should email [EMAIL PROTECTED] list, where Nutch users "hang out". Otis --- Christian Aschoff <[EMAIL PROTECTED]> wrote: > Hi, > > after three days of crawling the intranet, the nutch crawler threw > an exception :-( > > It seems that the crawler wants to do something with the .DS_stor

Re: How to install lucene on windows ?

2005-09-11 Thread Otis Gospodnetic
All you really need is: http://apache.oc1.mirrors.redwire.net/jakarta/lucene/binaries/lucene-1.4.3.jar Otis --- Arpit Sharma <[EMAIL PROTECTED]> wrote: > I have installed tomcat on XP but when I go to page > http://apache.oc1.mirrors.redwire.net/jakarta/lucene/binaries/ > > > it shows lot's of

Re: Problem of indexing pdf files

2005-09-11 Thread Otis Gospodnetic
That's a log4j warning message, because one of the PDFBox classes is trying to log something, and you don't have log4j configured appropriately. This is not a Lucene issue, and it's a warning, so you can ignore it if you want. Otis --- tirupathi reddy <[EMAIL PROTECTED]> wrote: > Hello, > >

Re: crawling protected pages

2005-09-11 Thread Otis Gospodnetic
I don't think there is. I assume you are considering using Nutch. If so, use nutch-user mailing list instead. You'll get more help there. Otis --- Edward Quick <[EMAIL PROTECTED]> wrote: > Hi, > > I need to crawl a site that is protected. Is there currently any way > to do > this with nutch

Re: How to install lucene on windows ?

2005-09-12 Thread Otis Gospodnetic
Hello Arpit, I suggest you take a look at Lucene in Action. Chapter 1 is free and downloadable, and contains an explanation of what Lucene is and what it is not. It also has some code to get you started and give you an idea about how Lucene can be used. Otis --- Arpit Sharma <[EMAIL PROTECTED]

RE: Binary fields in index

2005-09-26 Thread Otis Gospodnetic
One of the Jakarta Commons ones - jakarta.apache.org/commons/codec/ Otis --- Tricia Williams <[EMAIL PROTECTED]> wrote: > Which library can Base64 be found in? > > Thanks, > Tricia > > On Mon, 26 Sep 2005, Koji Sekiguchi wrote: > > > You can encode (e.g. base64) the binary data to get a Stri
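A minimal sketch of the encode/decode round trip discussed in this thread. It uses `java.util.Base64` from the Java 8+ standard library as a stand-in for the Commons Codec class mentioned above, so it runs without extra jars; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Base64;

// Illustrative only: turns binary data into a String and back.
class BinaryFieldSketch {
    // Encode raw bytes as an ASCII string that can be stored in a text field.
    static String toFieldValue(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decode the stored string back into the original bytes.
    static byte[] fromFieldValue(String stored) {
        return Base64.getDecoder().decode(stored);
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, 2, (byte) 0xFF};
        String stored = toFieldValue(original);
        System.out.println(stored);
        System.out.println(Arrays.equals(original, fromFieldValue(stored)));
    }
}
```

Base64 grows the data by about a third, so this suits small binary values better than large attachments.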

Re: updating Lucene Index

2006-01-27 Thread Otis Gospodnetic
I haven't followed the whole thread, but this looks weird: reader = reader.open(dir); if (reader != null) { reader.open(index_path); reader.open called twice? Why? Also, you'll get more help on [EMAIL PROTECTED] list. Otis - Original Message From: "Kodumuri, Madhav

Re: Hi doubt

2006-02-23 Thread Otis Gospodnetic
Ask on the nutch-user list, and I bet you'll quickly get your answer. My guess is yes, but Nutch developers purposely kept things simple. Also, Wildcard and a large-scale search engine/index may not go too well together for performance reasons. Otis - Original Message From: Raghaven

Re: What are the pros and cons of using the C# version of Lucene as compared to the Java version in a .NET environment?

2006-05-09 Thread Otis Gospodnetic
I have never used Lucene under Windows, but I do know that some quite high profile Internet companies have used Lucene.net port and are happy with it. See http://xanga.com Otis - Original Message From: George Carrette <[EMAIL PROTECTED]> To: general@lucene.apache.org Sent: Tuesday, Ma

Re: What are the pros and cons of using the C# version of Lucene as compared to the Java version in a .NET environment?

2006-05-09 Thread Otis Gospodnetic
> now > are general Lucene issues (such as scaling a large index and reducing > indexing times) rather than C# vs. Java issues. I highly recommend it. > > Monsur > Xanga.com > > > > > -Original Message- > > From: Otis Gospodnetic [mailto:[EMAIL PR

Re: Searching On Image...

2006-07-31 Thread Otis Gospodnetic
Actually, look at the Lucene group on Simpy - http://www.simpy.com/group/363 - there is an application or two there that deal with image searching and use Lucene, if I recall correctly. Otis - Original Message From: Erik Hatcher <[EMAIL PROTECTED]> To: general@lucene.apache.org Sent: M

Re: How to add an Arabic language analyzer to Lucene

2006-08-13 Thread Otis Gospodnetic
Hi Mahmoud, That's great! You should email the Lucene.net list, though. Any chance you'll port this to Java? Over at Lucene java, we'd love to have an Arabic analyzer. By the way, how does your analyzer compare to AraMorph in terms of precision and recall? Otis - Original Message

Re: Infrastructure for large Lucene index

2006-10-11 Thread Otis Gospodnetic
It sounds like the 11th node would have to have a large disk with all indices. Or perhaps you'd keep copies of all your indices elsewhere, and would pull the right one in when you see which node you need to replace. Otis - Original Message From: Slava Imeshev <[EMAIL PROTECTED]> To: g

Re: [Fwd: [PROPOSAL] index server project]

2006-10-20 Thread Otis Gospodnetic
That's distributed indexed, built on top of Sun Grid. The project won a $50K prize. - Original Message From: Alexandru Popescu <[EMAIL PROTECTED]> To: general@lucene.apache.org Sent: Thursday, October 19, 2006 10:19:00 AM Subject: Re: [Fwd: [PROPOSAL] index server project] I am not su

Re: [Fwd: [PROPOSAL] index server project]

2006-10-20 Thread Otis Gospodnetic
Damn Y! mail shortcut. The link to the project is in my Lucene group: http://www.simpy.com/group/363 Otis - Original Message From: Alexandru Popescu <[EMAIL PROTECTED]> To: general@lucene.apache.org Sent: Thursday, October 19, 2006 10:19:00 AM Subject: Re: [Fwd: [PROPOSAL] index server

Re: CLucene incubation - call for a mentor

2006-10-20 Thread Otis Gospodnetic
Hi Ben, I can't volunteer, but you may want to check with Garrett Rooney. He stopped work on lucene4c, so he may be interested in helping you with moving CLucene under Apache Lucene. Otis - Original Message From: Ben van Klinken <[EMAIL PROTECTED]> To: general@lucene.apache.org Sent:

Re: Lucene indexes in memory

2007-02-14 Thread Otis Gospodnetic
Deepa, You probably want to ask on [EMAIL PROTECTED] list. Lucene reads in the whole .tii index file (see the Lucene for explanations of various Lucene index files). It doesn't read in *all* the index files, as those could be quite big. You *can* read in your index in a RAMDirectory via FSDirecto

Re: [VOTE] Sponsor the Tika proposal

2007-03-13 Thread Otis Gospodnetic
+1 Otis - Original Message From: Jukka Zitting <[EMAIL PROTECTED]> To: general@lucene.apache.org Sent: Tuesday, March 13, 2007 5:35:02 PM Subject: [VOTE] Sponsor the Tika proposal Hi, I'd like to call a vote on the Lucene PMC to sponsor the Tika proposal [1] in the Apache Incubator. Th

Re: lucenebook.com down

2007-03-22 Thread Otis Gospodnetic
Hi Marco, Lucenebook.com is taking a break, siesta, so if you need to get a copy of Lucene in Action, you can get it from http://www.manning.com/hatcher2/ or from http://amazon.com . If you just want to grab the code, you can get that via http://www.manning.com/hatcher2/ as well. Otis . . .

Re: Searching sub string

2008-01-29 Thread Otis Gospodnetic
Wolfgang is right, but you can also enable the leading wildcard in the QueryParser, I believe, plus you can index reversed tokens and stick to the trailing wildcards. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Wolfgang Täger <[EMAIL PROTE
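The reversed-token idea can be sketched in plain Java, assuming the index is just a sorted set of terms. Every name below is hypothetical and nothing here is Lucene API. It shows how a leading-wildcard query like `*ene` turns into a trailing-wildcard (prefix) query over reversed tokens; it does not cover wildcards on both sides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative model of "index reversed tokens, query with a trailing wildcard".
class ReversedTokenSketch {
    // Reversed forms of all indexed tokens, kept sorted for prefix scans.
    private final TreeSet<String> reversed = new TreeSet<>();

    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    void index(String token) {
        reversed.add(reverse(token));
    }

    // Answers "*suffix" by scanning reversed tokens that start with reverse(suffix).
    List<String> endsWith(String suffix) {
        String prefix = reverse(suffix);
        List<String> hits = new ArrayList<>();
        for (String r : reversed.tailSet(prefix)) {
            if (!r.startsWith(prefix)) {
                break; // sorted order: no later entry can match
            }
            hits.add(reverse(r));
        }
        return hits;
    }

    public static void main(String[] args) {
        ReversedTokenSketch idx = new ReversedTokenSketch();
        idx.index("lucene");
        idx.index("scene");
        idx.index("solr");
        System.out.println(idx.endsWith("ene"));
    }
}
```

The cost is a second copy of each token in the index, in exchange for prefix-style lookups instead of a full term scan.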

Re: Google Summer of Code

2008-03-19 Thread Otis Gospodnetic
Bok Marko, Very interested. I suggest you continue the discussion on [EMAIL PROTECTED], though (CC-ing) You should note that there are several efforts around distributed Lucene. There is SOLR-303 for distributed search, and there is some work in progress in Hadoop land around distributed ind

Re: how to control the disk size of the indices

2008-03-24 Thread Otis Gospodnetic
Hi Yannis, I don't think there is anything of that sort in Lucene, but this shouldn't be hard to do with a process outside Lucene. Of course, optimizing an index increases its size temporarily, so your external process would have to take that into account and play it safe. You could also set

Re: Improving indexing and some questions

2008-03-25 Thread Otis Gospodnetic
Marko, You are not getting any responses here because this general@ list is pretty empty. Please email java-user list. I mentioned this in my previous reply, but for some reason you didn't go for it. Please see http://wiki.apache.org/lucene-java/HowToContribute Otis -- Sematext -- http://sema

Re: Wildcard Search over multiple fields

2008-05-07 Thread Otis Gospodnetic
Hello, Wildcard queries are inefficient in general. But it sounds like you simply want to combine them into a BooleanQuery where each clause is a SHOULD clause. A better place to ask is java-user list. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message --

Re: Relevence Feedback

2008-05-17 Thread Otis Gospodnetic
It doesn't. But do some searching, I know somebody did contribute a blind relevance feedback implementation to Lucene. The reference to that contribution must be out there somewhere. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: DanaWhite

Re: Multi-Processor Indexing

2008-05-20 Thread Otis Gospodnetic
Try using 8-9 indexing threads, all of which share the same IndexWriter instance. It would be great if you could report some observations or numbers. A better place to ask questions like this is on [EMAIL PROTECTED] (more people). Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nut
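A sketch of the threading pattern suggested above: several worker threads feeding one shared writer. `FakeWriter` is a hypothetical thread-safe stand-in so the block runs on its own; with Lucene, the single shared instance would be the `IndexWriter`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SharedWriterSketch {
    // Hypothetical stand-in for a thread-safe index writer.
    static class FakeWriter {
        private final AtomicInteger docs = new AtomicInteger();
        void addDocument(String doc) { docs.incrementAndGet(); }
        int docCount() { return docs.get(); }
    }

    // Starts the given number of threads, all adding documents to ONE writer.
    static int indexAll(int threads, int docsPerThread) {
        FakeWriter writer = new FakeWriter(); // one instance, shared by all threads
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    writer.addDocument("doc-" + id + "-" + i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return writer.docCount();
    }

    public static void main(String[] args) {
        System.out.println(indexAll(8, 1000) + " documents added");
    }
}
```

The thread count is worth tuning by measurement, since the best value depends on cores, disk speed, and analyzer cost.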

Re: Boolen operators

2008-05-22 Thread Otis Gospodnetic
Hi, They are not valid and my guess is that if you feed them to QueryParser you'll get an exception. A better place to ask questions about Lucene Java is on java-user list. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Manish Singh <[EMA

Re: Lucene is not able to index certain words of txt file converted form pdf

2008-06-18 Thread Otis Gospodnetic
Hi, Use java-user list, there are more people on it. You need to change the setting in IndexWriter that tells Lucene how many tokens from a document to index. By default it indexes only 10,000. I can't remember the parameter name, but look at the IndexWriter javadocs, it's right there. Oti

Re: Local Lucene and Local Solr

2008-08-25 Thread Otis Gospodnetic
1. sounds like the right choice to me. On the topic of committing early, would committing it and allowing people to svn up/co, build locally, and implement the missing pieces not get us faster to the point of being able to release it? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr -

Re: Replicating Lucene Index with out SOLR

2008-08-27 Thread Otis Gospodnetic
Hi, You may want to ask on the java-user list (more subscribers), which I'm CC-ing, so we can continue discussion there. I think you will have to implement your own logic that runs on A and does something like this: - stop adding new docs - call commit on the IndexWriter - copy the index - res
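The copy step in the list above might look roughly like this, assuming the index is a flat directory of files and that indexing is paused and committed before the copy starts. `IndexCopySketch` and `copyIndex` are made-up names; only `java.nio.file` from the standard library is used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class IndexCopySketch {
    // Copies every regular file from the source directory into the target
    // directory and returns how many files were copied.
    // Assumption: the caller has paused indexing and committed, so files are stable.
    static int copyIndex(Path source, Path target) {
        int copied = 0;
        try {
            Files.createDirectories(target);
            try (DirectoryStream<Path> files = Files.newDirectoryStream(source)) {
                for (Path file : files) {
                    if (Files.isRegularFile(file)) {
                        Files.copy(file, target.resolve(file.getFileName()),
                                   StandardCopyOption.REPLACE_EXISTING);
                        copied++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return copied;
    }

    public static void main(String[] args) {
        if (args.length == 2) {
            System.out.println(copyIndex(Paths.get(args[0]), Paths.get(args[1])) + " files copied");
        }
    }
}
```

Copying into a temporary directory and renaming it once complete keeps a reader on the other machine from seeing a half-copied index.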

Re: Updation of field/metadata value in a document

2008-09-30 Thread Otis Gospodnetic
Yes, but only if all your fields are stored (and not just indexed). Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Gaurav Sharma <[EMAIL PROTECTED]> > To: general@lucene.apache.org > Sent: Tuesday, September 30, 2008 6:34:14 AM > Subject: U

Re: Lucene Index file vs. database

2008-09-30 Thread Otis Gospodnetic
Zoki, A better list to ask this on is [EMAIL PROTECTED] In short, you can really go either way. Some people feel more comfortable storing everything in DB as they trust it more (RDBMS's have been around longer than Lucene has), know how to back it up, need data integrity (FKs), etc. Storing

Re: [VOTE] Graduate Tika to a Lucene subproject (Subproject Acceptance Vote)

2008-10-20 Thread Otis Gospodnetic
+1 and I look forward to seeing Tika closer to Lucene. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Jukka Zitting <[EMAIL PROTECTED]> > To: general@lucene.apache.org > Cc: [EMAIL PROTECTED] > Sent: Monday, October 20, 2008 8:17:47 PM > S

Re: Which one is better - Lucene OR Google Search Appliance

2008-12-04 Thread Otis Gospodnetic
I saw a pile of responses for this thread, but didn't read them all carefully. How much exactly will you be paying for a GSA that is licensed to handle 8M docs? I *believe* you'll be paying around $250,000. Now ask yourself if you can hire somebody to develop exactly what you want for less.

Re: Synchronization and merging indexes

2008-12-21 Thread Otis Gospodnetic
Logan, My guess is you'll get more help if you post your question to the Lucene.Net mailing list (whose address I don't recall off the top of my head). Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: chaiguy1337 > To: general@lucene.

Re: Spatial Search using Lucene and a Database

2008-12-27 Thread Otis Gospodnetic
Marc, I don't have a direct answer to your question, but I'd like to point out http://wiki.apache.org/lucene-java/SpatialSearch . It is still very bare, but this Spatial Contrib is about to get a good amount of attention. If you have more questions, I suggest you use java-user list instead of

Re: Welcome PyLucene

2009-01-09 Thread Otis Gospodnetic
Welcome! Do we need a new PyLucene tab now? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Andi Vajda > To: general@lucene.apache.org > Sent: Thursday, January 8, 2009 10:51:51 PM > Subject: Re: Welcome PyLucene > > > On Thu, 8 Jan 200

Re: PyLucene news

2009-01-27 Thread Otis Gospodnetic
And now we are almost running out of space for those Lucene subproject tabs! :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Michael McCandless > To: general@lucene.apache.org > Sent: Saturday, January 24, 2009 6:20:45 AM > Subject: Re:

Re: Allow committers from any subproject to edit TLP site

2009-03-24 Thread Otis Gospodnetic
+1 Otis - Original Message > From: Grant Ingersoll > To: general@lucene.apache.org > Sent: Saturday, March 21, 2009 1:04:05 PM > Subject: Allow committers from any subproject to edit TLP site > > What do people think of allowing any subproject committer the right to edit > the >

Re: Open Relevance Project?

2009-05-17 Thread Otis Gospodnetic
Not sure if this was mentioned before, but hm, I was going to point out http://index.isc.org/ (see http://ioiblog.wordpress.com/2008/11/07/kicking-off-the-ioi-blog/ ), but the server doesn't seem to be listening aha, here: http://ioiblog.wordpress.com/2009/02/ Perhaps we can get data

Re: what if my database data contains other language (like danish, german).

2009-05-17 Thread Otis Gospodnetic
Chris, I don't have the issue number here, but look in Lucene's JIRA and search for... ah, here: https://issues.apache.org/jira/browse/LUCENE-1166 And for Chinese: https://issues.apache.org/jira/browse/LUCENE-1629 If you happen to be using Solr: http://www.sematext.com/product-multil

Re: Open Relevance Project?

2009-05-18 Thread Otis Gospodnetic
I agree! As a matter of fact, that is exactly what I just wrote here: http://www.jroller.com/otis/entry/followup_open_relevance_project#comment-1242703187000 "For example, couldn't a vendor use it to compare old implementation to new implementation and provide some kind of metric showing impr

benchmark contrib, wikipedia, publishing results

2009-05-18 Thread Otis Gospodnetic
Been thinking about ORP on and off all day today... and Mark brought up the benchmark contrib. Shouldn't we publish Lucene results for that somewhere on the site? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

Re: RAM or File?

2009-05-26 Thread Otis Gospodnetic
Yes. I remember having a very hard time showing that RAMDirectory is faster than FSDirectory back in 2004 while writing Lucene in Action No. 1. If you run the unit test that's supposed to show it, I think you'll see this. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch --

Re: [VOTE] Make the Open Relevance Project (ORP) and official Lucene subproject

2009-05-28 Thread Otis Gospodnetic
+1 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Grant Ingersoll > To: general@lucene.apache.org > Sent: Thursday, May 28, 2009 7:26:35 AM > Subject: [VOTE] Make the Open Relevance Project (ORP) and official Lucene > subproject > > I'

Re: Open Relevance Project Kickoff

2009-06-02 Thread Otis Gospodnetic
Hello, 2. How about going with -user immediately, so that when we do add -user and -dev we don't have to migrate people? 3. Can Confluence send emails with page changes that look like diffs? I'm asking because MoinMoin does that and I find it very helpful. Mahout's Confluence doesn't do that

Re: [ORP] Fwd: Confluence email diffs

2009-06-10 Thread Otis Gospodnetic
Excellent, thanks for figuring this out. That link to diffs either wasn't there before or I never noticed it until now. But that works for me! Otis - Original Message > From: Grant Ingersoll > To: general@lucene.apache.org > Sent: Tuesday, June 9, 2009 11:06:42 AM > Subject: [ORP]

Re: Using Lucene to index OSM nodes (400M latitude/longitude points)

2009-06-23 Thread Otis Gospodnetic
Hi Kelly, I think you want to look at LocalLucene (or LocalSolr). I haven't played with Local*, so I can't provide more than this tip. Actually, I can also suggest to dump Plucene - it's a dead project, and even when it was alive it was quite slow. If you really need to be able to search fr

Re: Index Ratio

2009-06-24 Thread Otis Gospodnetic
Hi Brett, Try creating a simple MS Word document with just a single character in it. Save it as .doc and check the size. Export to PDF and check the size. I don't know exactly how big those docs will be, but I bet they'll be many, many times larger than that one byte character. Open up you

Re: [ORP] JIRA

2009-06-24 Thread Otis Gospodnetic
And 2 brand new mailing lists you can subscribe to: openrelevance-user-subscr...@lucene.apache.org openrelevance-dev-subscr...@lucene.apache.org Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Grant Ingersoll > To: general@lucene.apache

Re: [VOTE] Graduate Lucene.Net as a subproject under Apache Lucene

2009-10-09 Thread Otis Gospodnetic
+1 Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR - Original Message > From: George Aroush > To: general@lucene.apache.org > Sent: Thu, October 8, 2009 6:04:09 PM > Subject: [VOTE] Graduate Lucene.Ne

Lucene.net move time

2010-01-07 Thread Otis Gospodnetic
George & Co. (Doug & DIGY the mystery man) I see Lucene.net has its new ML archived under http://mail-archives.apache.org/mod_mbox/lucene-lucene-net-dev|user -- good What's the timeline for moving the site, getting the Wiki, and moving the svn repo? Is there an INFRA ticket for that? May want

NYC Search in the Cloud meetup: Jan 20

2010-01-12 Thread Otis Gospodnetic
Hello, If "Search Engine Integration, Deployment and Scaling in the Cloud" sounds interesting to you, and you are going to be in or near New York next Wednesday (Jan 20) evening: http://www.meetup.com/NYC-Search-and-Discovery/calendar/12238220/ Sorry for dupes to those of you subscribed to mul

Re: [VOTE] merge lucene/solr development

2010-03-04 Thread Otis Gospodnetic
+1 this is software. let's try it. if it doesn't work out, we know what to do. Otis - Original Message > From: Yonik Seeley > To: general@lucene.apache.org > Sent: Wed, March 3, 2010 5:42:38 PM > Subject: [VOTE] merge lucene/solr development > > Many Lucene/Solr committers think t

Re: [VOTE] merge lucene/solr development

2010-03-04 Thread Otis Gospodnetic
- Original Message > From: Andi Vajda > To: general@lucene.apache.org > Sent: Thu, March 4, 2010 2:13:22 PM > Subject: Re: [VOTE] merge lucene/solr development > > > On Thu, 4 Mar 2010, Mark Miller wrote: > > > Who knows - this isn't the official count - just a gauge of what has > >

Poll: solr-dev/java-dev overlap

2010-03-04 Thread Otis Gospodnetic
Out of curiosity. Please reply ONLY if you are a Solr committer and the answer to any of the questions is Yes. 1) Are you on solr-dev, but NOT on java-dev? 2) Are you active on solr-dev, but do NOT ACTIVELY FOLLOW java-dev? 3) Are you active on solr-dev, but do NOT ACTIVELY PARTICIPATE on

Re: [VOTE] merge lucene/solr development

2010-03-04 Thread Otis Gospodnetic
- Original Message > From: Uwe Schindler > To: general@lucene.apache.org > Sent: Thu, March 4, 2010 11:19:47 AM > Subject: RE: [VOTE] merge lucene/solr development > > If we vote on what Mike says, I revise my vote and simply vote +/-0 to not > stop > progress. I have some problem wit

Re: [VOTE] merge lucene/solr development

2010-03-04 Thread Otis Gospodnetic
- Original Message > From: Bill Au > To: general@lucene.apache.org > Sent: Wed, March 3, 2010 11:29:33 PM > Subject: Re: [VOTE] merge lucene/solr development > > In the case where changes are in Lucene only I think it is OK to increment > the Solr release number since even though the So

Re: [VOTE] merge lucene/solr development (take 3)

2010-03-09 Thread Otis Gospodnetic
Hello, (just using Yonik's email to reply, but my comments are more general) - Original Message > From: Yonik Seeley > To: general@lucene.apache.org > Sent: Tue, March 9, 2010 10:04:20 AM > Subject: Re: [VOTE] merge lucene/solr development (take 3) > > On Tue, Mar 9, 2010 at 9:48 AM,

Re: [VOTE] merge lucene/solr development (take 3)

2010-03-09 Thread Otis Gospodnetic
What do people think about doing what I wrote above as step 1 in this whole process? When that is done in N months, we can see if we can improve on it? This would also fit "progress, not perfection" mantra. Otis - Original Message ---- > From: Otis Gospodnetic > To: gene

Re: [VOTE] merge lucene/solr development (take 3)

2010-03-14 Thread Otis Gospodnetic
Would it be correct to say that for a vote to be perfectly clear, the VOTE thread should have just the votes and no comments/discussion? Otis - Original Message > From: Grant Ingersoll > To: general@lucene.apache.org > Sent: Fri, March 12, 2010 11:02:34 AM > Subject: Re: [

Re: [VOTE] merge lucene/solr development (take 3)

2010-03-14 Thread Otis Gospodnetic
Hi, Would it be correct to say that a subset of Lucene/Solr committers discussed the proposal "internally/offline" (i.e. not on MLs) before proposing it? Thanks, Otis

Re: [VOTE] merge lucene/solr development (take 3)

2010-03-14 Thread Otis Gospodnetic
Hello, - Original Message > From: Grant Ingersoll > To: general@lucene.apache.org > Sent: Fri, March 12, 2010 11:21:57 AM > Subject: Re: [VOTE] merge lucene/solr development (take 3) > > On Mar 12, 2010, at 11:07 AM, Mattmann, Chris A (388J) wrote: > > Here's what I didn't like. The

Re: [VOTE] merge lucene/solr development (take 3)

2010-03-14 Thread Otis Gospodnetic
Hello, - Original Message > From: Grant Ingersoll > To: general@lucene.apache.org > Sent: Fri, March 12, 2010 12:03:07 PM > Subject: Re: [VOTE] merge lucene/solr development (take 3) > > On Mar 12, 2010, at 11:54 AM, patrick o'leary wrote: >>> Go > look at the votes. > > Which ones

Re: [VOTE] merge lucene/solr development (take 3)

2010-03-14 Thread Otis Gospodnetic
Hi, But remember the early days of this (or these) vote threads. I recall some people saying things like "I won't vote -1 since I don't want to veto the proposal, so I'll vote +|-0". I recall Doug being one of those people. I don't think we heard back from Doug in subsequent vote threads. I

Less drastic ways

2010-03-14 Thread Otis Gospodnetic
Hi, Consider this just an email to clarify things for Otis (and maybe a few other people). Are the following the main goals of the recent merge voting thread(s)? * Make it easier for Solr to ride the Lucene trunk * Make it easier for people to avoid committing new features to Solr when they rea

Re: [VOTE] merge lucene/solr development (take 3)

2010-03-14 Thread Otis Gospodnetic
Hi, - Original Message > From: Grant Ingersoll > To: general@lucene.apache.org > Sent: Tue, March 9, 2010 5:00:42 PM > Subject: Re: [VOTE] merge lucene/solr development (take 3) > > On Mar 9, 2010, at 12:38 PM, Otis Gospodnetic wrote: > > > > * I

Re: Poll: solr-dev/java-dev overlap

2010-03-14 Thread Otis Gospodnetic
Hi, Hey Grant, I'm not picking on you, I swear! :) - Original Message > From: Grant Ingersoll > To: general@lucene.apache.org > Sent: Wed, March 10, 2010 6:56:57 AM > Subject: Re: Poll: solr-dev/java-dev overlap > > On Mar 4, 2010, at 3:52 PM, Otis Gospo

Re: Less drastic ways

2010-03-14 Thread Otis Gospodnetic
Hello, - Original Message > From: Grant Ingersoll > To: general@lucene.apache.org > Sent: Sun, March 14, 2010 12:40:51 PM > Subject: Re: Less drastic ways > > On Mar 14, 2010, at 12:28 PM, Otis Gospodnetic wrote: > > Hi, > > Consider this just an em

Re: [VOTE] merge lucene/solr development (take 3)

2010-03-14 Thread Otis Gospodnetic
Hi, - Original Message > From: Yonik Seeley > To: general@lucene.apache.org > Sent: Sun, March 14, 2010 3:48:10 PM > Subject: Re: [VOTE] merge lucene/solr development (take 3) > > On Sun, Mar 14, 2010 at 2:36 PM, Otis Gospodnetic < > ymailto="mail

Re: Less drastic ways

2010-03-14 Thread Otis Gospodnetic
I don't get it, Mike. :) Even if we merge Lucene/Solr and we treat Solr as just another Lucene contrib/module, say, contributors who care only about Solr will still patch against Solr and Lucene developers or those people who have the itch for that functionality being in Lucene, too, will still

Re: Branding Solr+Lucene

2010-03-23 Thread Otis Gospodnetic
I wish we could somehow lose that "Lucene-Java". The -Java part seems so forced, like we couldn't come up with anything better. Otis - Original Message > From: Chris Hostetter > To: "general@lucene.apache.org" > Sent: Tue, March 23, 2010 2:43:36 PM > Subject: RE: Branding Solr+Lucen

Re: java.io.IOException: read past EOF

2010-03-23 Thread Otis Gospodnetic
Jean-Michael, java-u...@lucene is a better place to ask. I'd do this: * back up your index * use CheckIndex tool (if it existed in your version of Lucene?) Maybe Luke version you are using has a mismatching Lucene version? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hado

Re: Should I avoid MultiFieldQueryParser?

2010-05-31 Thread Otis Gospodnetic
What you lose by aggregating all real fields into 1 field is the ability to give fields different scoring weights. Is a match in the post title equally important as a match in the body or in one of the comments? If yes, then aggregate. Otis Sematext :: http://sematext.com/ :: Solr - Lucene

Re: [PMC] Next Steps on Lucene.NET

2010-12-24 Thread Otis Gospodnetic
Personally, I would be *very* interested whether moving Lucene.NET to GitHub will make a difference in terms of progress and style of development. Maybe forking, pull requests, and the whole "social" thing makes it easier for people to participate. Since Lucene.NET has struggled for years at A

Re: TF & IDF values for a search term

2010-12-24 Thread Otis Gospodnetic
Vikas, look at DefaultSimilarity. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: vikas kumar > To: general@lucene.apache.org > Sent: Thu, December 16, 2010 6:53:24 AM > Subject: TF & I

Re: Apache Solr is not available

2010-12-25 Thread Otis Gospodnetic
Hi, I think you'll get more help if you ask Drupal community. That error message is specific to Drupal. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: nitishgarg > To: general@luce

Re: Get last search data from SOLR

2011-01-18 Thread Otis Gospodnetic
Jotta, You may want to ask on solr-user list in the future. If you are asking whether Solr can tell you what was the last document that Solr returned to the last query it executed, the answer is no. Maybe you can describe what you are trying to accomplish, so we can help you. Email solr-user

Re: Number of Boolean Clauses (AND vs OR)

2011-04-11 Thread Otis Gospodnetic
I believe AND will be faster, at least in cases when one of the earlier clauses doesn't actually match any docs, in which case the whole query should terminate early and not evaluate the remaining clauses. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem searc
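The early-termination idea can be sketched in Python as follows. This is a simplified model using sets as posting lists, not Lucene's actual query execution; the function names and the sample index are assumptions for illustration.

```python
# Simplified model: posting lists as Python sets, not Lucene internals.
def and_query(postings, terms):
    """Intersect posting lists, stopping once the result is empty.

    Returns the matching doc ids and how many clauses were evaluated.
    """
    result = None
    evaluated = 0
    for term in terms:
        evaluated += 1
        docs = postings.get(term, set())
        result = set(docs) if result is None else result & docs
        if not result:
            break  # no document can match the remaining clauses
    return (result or set()), evaluated


def or_query(postings, terms):
    """Union posting lists; every clause has to be evaluated."""
    result = set()
    evaluated = 0
    for term in terms:
        evaluated += 1
        result |= postings.get(term, set())
    return result, evaluated


# Hypothetical index: term -> ids of documents containing it.
postings = {"a": {1, 2}, "b": set(), "c": {2, 3}}
```

Here `and_query(postings, ["b", "a", "c"])` stops after the first clause because "b" matches nothing, while `or_query` on the same terms has to visit all three.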

Re: is query cache persisted?

2011-04-12 Thread Otis Gospodnetic
Hi, Are you using raw Lucene or Solr? If Solr, your query is probably cached in the query results cache (see your solrconfig.xml). Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: Yan

Re: Query performance, bitset, btree

2011-05-29 Thread Otis Gospodnetic
Hello, I'm guessing both Lucene and a DB (relational or not) may be about the same here. Query like name="John" and age=30 and city="London" could be done with either, but if you think you'll need to expand those queries to include full-text search, then I'd go with Lucene (or Solr). Otis ---

Re: Lucene: is it possible to search with an error in one letter?

2011-05-30 Thread Otis Gospodnetic
Hi, Yes, penc?l should do it. Otis --- Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: boraldo > To: general@lucene.apache.org > Sent: Mon, May 30, 2011 8:08:54 AM > Subject: Lucene: is it possi
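In Lucene query syntax, `?` stands for exactly one character. A small Python sketch of that matching rule, using the standard-library `fnmatch` module as a stand-in (it is not Lucene's matcher, and `wildcard_matches` is a hypothetical helper name):

```python
from fnmatch import fnmatchcase


def wildcard_matches(pattern, terms):
    """Return the terms matching a pattern where '?' is one character."""
    return [t for t in terms if fnmatchcase(t, pattern)]


# Hypothetical indexed terms.
terms = ["pencil", "pencal", "pencl", "pencils"]
```

`wildcard_matches("penc?l", terms)` keeps "pencil" and "pencal" but drops "pencl" (a letter is missing) and "pencils" (an extra letter), so the wildcard covers a wrong letter at a known position, not an insertion or deletion.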

Re: CLOSE_WAIT after connecting to multiple shards from a primary shard

2011-05-30 Thread Otis Gospodnetic
Hi, A few things: 1) why not send this to the Solr list? 2) you talk about searching, but the code sample is about optimizing the index. 3) I don't have the SolrJ API in front of me, but isn't there a CommonsHttpSolrServer ctor that takes a URL instead of an HttpClient instance? Try that one. Otis ---

Re: Multiple Solr replicaton threads

2011-09-05 Thread Otis Gospodnetic
Ram, What is x in your case and how much data needs to be replicated each time, roughly? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ > >From: bramsreddy >To: general@lucene.apache.org

Re: Analyzer for code?

2011-09-07 Thread Otis Gospodnetic
Alan, if you have Lucene in Action 2, there is a case study that describes how Krugle did this. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ > >From: Alan Williamson (aw2.0 cloud experts)

Re: Suggestions or best practices for indexing the logs

2011-10-17 Thread Otis Gospodnetic
Alex, You could try compressing the content field - that might help a bit. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ > >From: Alex Shneyderman >To: general@lucene.apache.org >Sent: T

Re: Performance issues with high query volume

2011-12-16 Thread Otis Gospodnetic
Hi, It's hard to tell.  For example, the problem could "simply" be with the JVM and how it is tuned or not tuned, not at all a Solr problem.  Or it could be a Solr problem. Try using SPM (link below, it's free) - this will help you figure out what is going on and see what effect any changes yo

Re: [Announce] Solr 3.5 with RankingAlgorithm 1.3, NRT support

2011-12-28 Thread Otis Gospodnetic
Hi, Is there a writeup that describes how this compares to NRT support in the development version of Solr? Otis Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html > > From: Nagendra Nagarajayya >To: genera

Re: Licensing questions

2012-05-13 Thread Otis Gospodnetic
Hi Tiffany, Apache Lucene is free.  There is no corporation behind it.  It is released under Apache Software License by the Apache Software Foundation. Otis  Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm  > > From: Tiff
