Re: hit scoring on latest build

2002-11-04 Thread aaron J titus
Thank you Doug. I should have looked there first. Cheers, Aaron -- On Mon, 04 Nov 2002 14:16:20 Doug Cutting wrote: >If you check the CHANGES file for changes made since the 1.2 release, >you'll find: > >Added support for boosting the score of documents and fields via the >new metho

Re: want to know, java search engine which can crawl local and remote websites and cr indexes

2002-11-04 Thread Otis Gospodnetic
Man, you have got to look at the web site first, I can't serve all the info! :) There is a whole nice, long Technical Overview of LARM available at the site. You can get LARM out of the CVS and try it. I would not call it supported sw, as it hasn't even been released yet. Good luck. Otis --- n

Re: want to know, java search engine which can crawl local and remote websites and cr indexes

2002-11-04 Thread nandkumar rayanker
Thanks. How can I get some more information about it, and who is responsible for it? Thanks and regards, Nandkumar --- Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > I think somebody already mentioned LARM. > > Otis > > --- nandkumar rayanker <[EMAIL PROTECTED]> > wrote: > > Hi, > > > > I am developin

Re: hit scoring on latest build

2002-11-04 Thread Doug Cutting
If you check the CHANGES file for changes made since the 1.2 release, you'll find: Added support for boosting the score of documents and fields via the new methods Document.setBoost(float) and Field.setBoost(float). Note: This changes the encoding of an indexed value. Indexes should

Re: want to know, java search engine which can crawl local and remote websites and cr indexes

2002-11-04 Thread Otis Gospodnetic
I think somebody already mentioned LARM. Otis --- nandkumar rayanker <[EMAIL PROTECTED]> wrote: > Hi, > > I am developing an search application which needs a > Java search engine with ability to crawl local and > remote websites and create indexes. > > I tried using i2a.WebSearch but not sure a

Re: Any comments on i2a.WebSearch ?

2002-11-04 Thread Otis Gospodnetic
From what I remember: not scalable, more of a proof of concept implementation. Otis --- nandkumar rayanker <[EMAIL PROTECTED]> wrote: > Thanks for the reply.. > Yes it did answer few questions.. > > I would like to have some comments on "i2a.WebSearch" > which is powered by Lucene. > > Regard

want to know, java search engine which can crawl local and remote websites and cr indexes

2002-11-04 Thread nandkumar rayanker
Hi, I am developing a search application which needs a Java search engine with the ability to crawl local and remote websites and create indexes. I tried using i2a.WebSearch but am not sure about its support and stability. I need your input ASAP. If not, can I use Lucene to implement the required features?

Re: Modify demo jar files, build question

2002-11-04 Thread Brian Cuttler
Otis, Yes, I was pretty much assuming it was something that stupid. Can you tell me though which class files are supposed to be being seen during compilation so that I can determine where the classpath should point ? > Yes, you need to set your CLASSPATH properly. I won't go into that > here, b

Any comments on i2a.WebSearch ?

2002-11-04 Thread nandkumar rayanker
Thanks for the reply. Yes, it did answer a few questions. I would like to have some comments on "i2a.WebSearch" which is powered by Lucene. Regards, Nandkumar --- "Abhay Y. Saswade" <[EMAIL PROTECTED]> wrote: > Read FAQ no. 10 from Official Lucene site. > http://lucene.sourceforge.net/cgi-bin/fa

Re: I am getting following error wheni search string with "-" "DNS-456-333",need help ASAP.

2002-11-04 Thread Otis Gospodnetic
The '-' character is a special character. One should be able to escape it using the '\' character. Otis --- nandkumar rayanker <[EMAIL PROTECTED]> wrote: > Hi, > > When tried to search string for e.g "DNS-456-333" i > get following error any idea? need help ASAP. > > thanks > Nandkumar > error
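The escaping suggested above can be sketched as a small helper. This is only an illustration: `QueryEscaper` and its list of special characters are assumptions, not part of Lucene, and whether the query parser honors the backslash depends on the build in use.

```java
// Hypothetical helper, not part of Lucene: backslash-escapes characters
// that a query parser may treat as operators.
final class QueryEscaper {
    // Assumed set of special characters; adjust it to match the parser in use.
    private static final String SPECIAL = "\\+-!(){}[]^\"~*?:";

    // Returns the term with a backslash inserted before each special character.
    static String escape(String term) {
        StringBuilder out = new StringBuilder(term.length() + 4);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("DNS-456-333")); // prints DNS\-456\-333
    }
}
```

The escaped string would then be handed to the query parser in place of the raw user input.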

I am getting following error wheni search string with "-" "DNS-456-333",need help ASAP.

2002-11-04 Thread nandkumar rayanker
Hi, When I tried to search for a string, e.g. "DNS-456-333", I got the following error. Any idea? Need help ASAP. Thanks, Nandkumar error message: == ps\websearch\WEB-INF\search/dscccols/index com.lucene.queryParser.ParseException: Encountered "-" at line 0, column 9. Was expecting one of: .

hit scoring on latest build

2002-11-04 Thread aaron J titus
Hello Everyone, I have just downloaded the newest of the nightly builds (10/27) from the apache.org site because I was looking for the specific feature of being able to control the default conjunction. I noticed that the hit scoring has changed drastically since the release build. I checked bug

Re: Deleting fields from a Document

2002-11-04 Thread Doug Cutting
Kelvin Tan wrote: Document maintains a linked list of Fields. It would not be difficult to delete a random Field, albeit a little inefficient. That would delete it from the in-memory representation, but, once it has been indexed, there is no easy way to remove a field value from a document

RE: Searching big documents

2002-11-04 Thread Alex Murzaku
I don't think it has to do with the K's (128K) but rather with the maximum number of words: /* By default, no more than 10,000 terms will be indexed for a field. */ public int maxFieldLength = 10000; -- Alex Murzaku ___ alex(at)lissus.com http://www.li

Re: Searching big documents

2002-11-04 Thread Otis Gospodnetic
IndexWriter.java: public int maxFieldLength = 10000; You need to set this to some higher number. Otis --- Marcus Ericsson <[EMAIL PROTECTED]> wrote: > Hello. I have a problem searching big documents. If "content" is to > big > (over 128k) I can find the text in lucenes datafiles but I dont get
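To see why words late in a large document produce no hits, here is a toy model of a per-field term limit. It is not Lucene's indexing code: `FieldLimitDemo` and its whitespace tokenization are assumptions made for illustration. In Lucene itself the setting is the public `maxFieldLength` field on `IndexWriter` quoted above.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only; this is not Lucene's indexing code.
final class FieldLimitDemo {
    // Returns the terms that would be searchable if only the first
    // maxFieldLength whitespace-separated terms of a field were indexed.
    static Set<String> indexedTerms(String text, int maxFieldLength) {
        Set<String> terms = new LinkedHashSet<>();
        int seen = 0;
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty input yields a single empty token
            }
            if (seen++ >= maxFieldLength) {
                break; // everything past the limit is silently dropped
            }
            terms.add(token);
        }
        return terms;
    }
}
```

With a limit of 3, the text "alpha beta gamma delta" yields the first three terms only, so a search for "delta" finds nothing even though the word is in the document.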

Searching big documents

2002-11-04 Thread Marcus Ericsson
Hello. I have a problem searching big documents. If "content" is too big (over 128k), I can find the text in Lucene's data files, but I don't get search hits on words after 128k. /Marcus

RE: indexing other documents (.doc .pdf .txt ...)

2002-11-04 Thread Otis Gospodnetic
This is a FAQ answered on jGuru, and I think there are some useful links on the Lucene Contributions page on Lucene's site. Otis --- "Murthy, Suryanarayana (MED, TCS)" <[EMAIL PROTECTED]> wrote: > Where is this class? > > -Original Message- > From: Vinod Bhagat [mailto:vbhagat@;blastradi

RE: indexing other documents (.doc .pdf .txt ...)

2002-11-04 Thread Vinod Bhagat
Well, Lucene cannot directly index pdf. You have to extract the text from the pdf first. Jpedal is a good library that I used to extract text from pdf, and then Lucene's APIs can index it. You need to download the Jpedal library from http://www.jpedal.org/ and the mentioned class is there in the examp

RE: indexing other documents (.doc .pdf .txt ...)

2002-11-04 Thread Murthy, Suryanarayana (MED, TCS)
Where is this class? -Original Message- From: Vinod Bhagat [mailto:vbhagat@;blastradius.com] Sent: Monday, November 04, 2002 4:59 PM To: 'Lucene Users List' Subject: RE: indexing other documents (.doc .pdf .txt ...) look at the ExtracttextObjects.java class.. this is ur answer for pdf.

Re: Lucene and XML

2002-11-04 Thread Otis Gospodnetic
There is a DOM and a SAX2 example in Lucene-Sandbox. http://cvs.apache.org/viewcvs/jakarta-lucene-sandbox/contributions/XML-Indexing-Demo/ Otis --- "Richly, Gerhard" <[EMAIL PROTECTED]> wrote: > Hello together, > > > Who knows an easy, stable, already finished, tool or an extension of > Lucene

RE: indexing other documents (.doc .pdf .txt ...)

2002-11-04 Thread Vinod Bhagat
look at the ExtracttextObjects.java class.. this is ur answer for pdf vin. -Original Message- From: Friaa Nafaa [mailto:friaa@;excite.com] Sent: Monday, November 04, 2002 5:04 PM To: [EMAIL PROTECTED] Subject: indexing other documents (.doc .pdf .txt ...) Can I index pdf or doc

indexing other documents (.doc .pdf .txt ...)

2002-11-04 Thread Friaa Nafaa
Can I index pdf, doc, or txt documents with Lucene, and how do I proceed to do this? I have installed a demo copy of Lucene, and when I index a set of documents, Lucene indexes only the html documents and no pdf or doc. Thanks. ___ Join Excite! - http://www.e

Re: "Low Index Operation"

2002-11-04 Thread Otis Gospodnetic
Maybe you can merge your indices: http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#addIndexes(org.apache.lucene.store.Directory[]) Otis --- Paul Bozan MCR <[EMAIL PROTECTED]> wrote: > Hi > I have two separate index directories, Index1 and Index2. > I want move on

Lucene and XML

2002-11-04 Thread Richly, Gerhard
Hello together, Who knows an easy, stable, already finished, tool or an extension of Lucene, where i can index XML-Files?? Unfortunately you can't download the tool from ISOGEN, it is only a demo-version. It should also have a good documentation, it is for a beginner. Could you please tell

Re: Indexing distant web sites

2002-11-04 Thread Karl Øie
Oh, sorry, I was perhaps not making myself clear here. You will have to use the crawler to retrieve the content and store it locally for indexing, so you will have to set up your crawler to fetch a site and store every html page's content to disk, then run Lucene on the locally stored ht
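The "store locally, then index" step described above might look roughly like this. `PageStore` and its URL-to-file-name mapping are hypothetical and belong to neither Lucene nor any crawler framework. The crawler is assumed to hand over each page's URL and HTML text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: keeps crawled pages on disk so an indexer can read them later.
final class PageStore {
    // Maps a URL to a file name by replacing unsafe characters with '_'.
    // Distinct URLs can collide under this simple mapping.
    static String fileNameFor(String url) {
        return url.replaceAll("[^A-Za-z0-9._-]", "_");
    }

    // Writes the page content under root and returns the file that was written.
    static Path store(Path root, String url, String html) {
        try {
            Files.createDirectories(root);
            Path target = root.resolve(fileNameFor(url));
            Files.write(target, html.getBytes(StandardCharsets.UTF_8));
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Once the crawl finishes, the indexer can be pointed at the root directory just as it would be for a local site.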

Re: Indexing distant web sites

2002-11-04 Thread Friaa Nafaa
Thank you. I installed this crawler and ran it, but I would like to index the web site, not list the links visited by the crawler. Is there a way to search a web page with Lucene, using this crawler to visit the pages? Thanks. --- On Mon 11/04, Karl Marx < [EMAIL PROTECTED] > wrote:F

RE: Working with a Distributed System

2002-11-04 Thread Rob Outar
Thank you all for replying and I will let u know how it goes. Thanks, Rob -- To unsubscribe, e-mail: For additional commands, e-mail:

"Low Index Operation"

2002-11-04 Thread Paul Bozan MCR
Hi, I have two separate index directories, Index1 and Index2. I want to move one or more documents from Index1 into Index2. All fields are UnStored. Can anyone explain how I can do this? Thanks.

Re: Indexing distant web sites

2002-11-04 Thread Karl Marx
As stated in the official FAQ, Lucene doesn't implement a web crawler. You can, however, use a self-made crawler or customize a crawler framework like websphinx (http://www-2.cs.cmu.edu/~rcm/websphinx/) to retrieve html documents from a site and then feed them to Lucene. mvh karl øie On Monday,

Re: Getting the last modified date

2002-11-04 Thread Karl Marx
To convert the long to a string you can use the java.text.DateFormat class. mvh karl øie On Monday, Nov 4, 2002, at 11:22 Europe/Oslo, Friaa Nafaa wrote: Hello,Pleease can I get the last modified date of an indexed document by lucene, I know how to use the field "modified" of the file but th
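A minimal sketch of the `java.text.DateFormat` suggestion, assuming the long is a count of milliseconds since the epoch (as returned by `File.lastModified()`). The class name `ModifiedDate`, the pattern, and the UTC time zone are choices made for this example. If the stored value is an encoded string rather than a long, it would need decoding first.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for turning a last-modified timestamp into readable text.
final class ModifiedDate {
    // Formats milliseconds since the epoch as "yyyy-MM-dd HH:mm:ss" in UTC.
    static String format(long millis) {
        DateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(millis));
    }

    public static void main(String[] args) {
        System.out.println(format(0L)); // prints 1970-01-01 00:00:00
    }
}
```

A fresh formatter is created on each call because `SimpleDateFormat` instances are not thread-safe.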

Indexing distant web sites

2002-11-04 Thread Friaa Nafaa
Hello, is there any way to index web sites with Lucene, assuming we know only the URL of the site? For local use we pass Lucene the full directory tree of our site (containing all the documents) and begin the indexing operation, but when I would like to index a distant site on t

Getting the last modified date

2002-11-04 Thread Friaa Nafaa
Hello, please, can I get the last modified date of a document indexed by Lucene? I know how to use the "modified" field of the file, but this field returns a long that can't be interpreted as a date (it looks like a string, e.g. A65tR887).