Re: SinkTokenizer: next(Token) vs. next()

2007-12-28 Thread Doron Cohen
Hi Grant, "safer" was not the best wording, sorry for that - I meant performance wise, there's no correctness issue. The "contract" of the two next methods as I understand it is that a TS must implement one of them. I see no harm in implementing the two (but doing so is likely to just duplicate T

Doubts with Lucene 2.2.0 - How to do the Indexing lucene´s file.

2007-12-28 Thread Jesiel Trevisan
Hi everyone, I have a question about Lucene 2.2.0. I got the Lucene Search web example, but it contains nothing about indexing files; there is only the searcher example. I would like to see or read something about how to create the Lucene index files.

Re: site javadocs link broken

2007-12-28 Thread Doron Cohen
Hi Steven, It is the trunk Javadocs that are missing for me - I don't see the link to Hudson. Going to the main project page - http://lucene.apache.org/java is redirecting to - http://lucene.apache.org/java/docs/index.html and then after navigating to - Documentation --> Javadocs there are 4 link

Re: SinkTokenizer: next(Token) vs. next()

2007-12-28 Thread Yonik Seeley
On Dec 28, 2007 8:20 AM, Doron Cohen <[EMAIL PROTECTED]> wrote: > The "contract" of the two next methods as I understand it is that > a TS must implement one of them. I see no harm in implementing > the two (but doing so is likely to just duplicate TokenStream's code.) I don't think the contract w

Re: SinkTokenizer: next(Token) vs. next()

2007-12-28 Thread Doron Cohen
> > > a TS must implement one of them. I see no harm in implementing > > the two (but doing so is likely to just duplicate TokenStream's code.) > > I don't think the contract was ever laid out so strictly. I think > it's fine for any TokenStream to implement both if it's advantageous > to do so. >

Re: SinkTokenizer: next(Token) vs. next()

2007-12-28 Thread Yonik Seeley
On Dec 28, 2007 8:43 AM, Doron Cohen <[EMAIL PROTECTED]> wrote: > > > > > a TS must implement one of them. I see no harm in implementing > > > the two (but doing so is likely to just duplicate TokenStream's code.) > > > > I don't think the contract was ever laid out so strictly. I think > > it's f

Re: site javadocs link broken

2007-12-28 Thread Grant Ingersoll
http://www.gossamer-threads.com/lists/lucene/java-dev/55471?search_string=javadocs;#55471 On Dec 28, 2007, at 8:33 AM, Doron Cohen wrote: Hi Steven, It is the trunk Javadocs that are missing for me - I don't see the link to Hudson. Going to the main project page - http://lucene.apache.org/ja

Re: site javadocs link broken

2007-12-28 Thread Doron Cohen
Thanks Grant, I see it now, totally missed that thread. Is there a reason to keep the self "javadocs" link in http://lucene.apache.org/java/docs/javadocs.html ? On Dec 28, 2007 3:54 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > > http://www.gossamer-threads.com/lists/lucene/java-dev/55471?se

Re: site javadocs link broken

2007-12-28 Thread Grant Ingersoll
That's auto-generated by Forrest, same as on all the other pages. Is kind of silly though, when there is only 1 section. -Grant On Dec 28, 2007, at 9:05 AM, Doron Cohen wrote: Thanks Grant, I see it now, totally missed that thread. Is there a reason to keep the self "javadocs" link in http

[jira] Commented: (LUCENE-1058) New Analyzer for buffering tokens

2007-12-28 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554794 ] Grant Ingersoll commented on LUCENE-1058: - http://www.gossamer-threads.com/lists/lucene/java-dev/56255 Chan

Re: SinkTokenizer: next(Token) vs. next()

2007-12-28 Thread Doron Cohen
On Dec 28, 2007 3:54 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > On Dec 28, 2007 8:43 AM, Doron Cohen <[EMAIL PROTECTED]> wrote: > > > > > > > a TS must implement one of them. I see no harm in implementing > > > > the two (but doing so is likely to just duplicate TokenStream's > code.) > > > > >

Re: SinkTokenizer: next(Token) vs. next()

2007-12-28 Thread Doron Cohen
On Dec 28, 2007 4:10 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > I'm fine w/ making this change. No sense in implementing both as we > can just rely on next(Token) to call next(). I will commit the change > and put a comment on the issue that created the SinkTokenizer. > Cool thanks!
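
The point agreed on in this thread, that a stream can implement only next() and rely on the inherited next(Token) to call it, can be sketched roughly as follows. This is a simplified, self-contained illustration and not Lucene's actual code: SimpleToken, SimpleTokenStream, CachedSink, and NextContractDemo are hypothetical stand-ins, and the two base-class defaults are an assumption modelled on the contract described above (each method delegates to the other, so a subclass has to override at least one).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token; not Lucene's Token class.
class SimpleToken {
    String text;
    SimpleToken() {}
    SimpleToken(String text) { this.text = text; }
}

// Hypothetical stand-in for a token stream base class.
// Assumption: each default delegates to the other, so a subclass
// must override at least one of the two methods.
abstract class SimpleTokenStream {
    // Default: allocate a fresh token and delegate to the reuse variant.
    SimpleToken next() {
        return next(new SimpleToken());
    }

    // Default: ignore the caller's token and delegate to next().
    SimpleToken next(SimpleToken reusable) {
        return next();
    }
}

// A sink-like stream that overrides only next(); callers using
// next(SimpleToken) still work through the inherited default.
class CachedSink extends SimpleTokenStream {
    private final Iterator<String> words;

    CachedSink(List<String> words) {
        this.words = words.iterator();
    }

    @Override
    SimpleToken next() {
        // Returns null when the stream is exhausted.
        return words.hasNext() ? new SimpleToken(words.next()) : null;
    }
}

public class NextContractDemo {
    // Consumes a stream through the reuse-style method only.
    static List<String> drain(SimpleTokenStream ts) {
        List<String> out = new ArrayList<String>();
        SimpleToken reusable = new SimpleToken();
        for (SimpleToken t = ts.next(reusable); t != null; t = ts.next(reusable)) {
            out.add(t.text);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new CachedSink(Arrays.asList("a", "b")))); // prints [a, b]
    }
}
```

In this sketch, a subclass that overrides neither method would recurse until the stack overflows, which is why the "must implement one of them" reading of the contract matters; overriding both is harmless but duplicates the base-class delegation.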

Re: SinkTokenizer: next(Token) vs. next()

2007-12-28 Thread Grant Ingersoll
I'm fine w/ making this change. No sense in implementing both as we can just rely on next(Token) to call next(). I will commit the change and put a comment on the issue that created the SinkTokenizer. -Grant On Dec 28, 2007, at 8:43 AM, Doron Cohen wrote: a TS must implement one of th

Re: site javadocs link broken

2007-12-28 Thread Doron Cohen
On Dec 28, 2007 4:17 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > That's auto-generated by Forrest, same as on all the other pages. Is > kind of silly though, when there is only 1 section. Oh I see, what confused me is that the section and the page have almost identical names. Any objection

RE: site javadocs link broken

2007-12-28 Thread Steven A Rowe
Hi Doron, On 12/28/2007 at 8:33 AM, Doron Cohen wrote: > It is the trunk Javadocs that are missing for me - I don't see > the link to Hudson. [...] > In what page do you see a link to > http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/index.html?

Re: site javadocs link broken

2007-12-28 Thread Grant Ingersoll
On Dec 28, 2007, at 9:50 AM, Doron Cohen wrote: On Dec 28, 2007 4:17 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: That's auto-generated by Forrest, same as on all the other pages. Is kind of silly though, when there is only 1 section. Oh I see, what confused me is that the section and t

RE: Doubts with Lucene 2.2.0 - How to do the Indexing lucene´s file.

2007-12-28 Thread Steven A Rowe
Hi Jesiel, Here's a good place to start: http://wiki.apache.org/lucene-java/Resources Steve On 12/28/2007 at 8:27 AM, Jesiel Trevisan wrote: > Hi everyone, > > I just have a little doubts with Lucene 2.2.0 > > I got the web example of Lucene Search, but, it does not have > nothing about >

Re: site javadocs link broken

2007-12-28 Thread Doron Cohen
Hi Steve, thanks, I missed that thread, now I can see it. Thanks, Doron On Dec 28, 2007 5:17 PM, Steven A Rowe <[EMAIL PROTECTED]> wrote: > Hi Doron, > > On 12/28/2007 at 8:33 AM, Doron Cohen wrote: > > It is the trunk Javadocs that are missing for me - I don't see > > the link to Hudson. > [...]

[jira] Created: (LUCENE-1102) EnwikiDocMaker id field

2007-12-28 Thread Grant Ingersoll (JIRA)
EnwikiDocMaker id field --- Key: LUCENE-1102 URL: https://issues.apache.org/jira/browse/LUCENE-1102 Project: Lucene - Java Issue Type: Improvement Components: contrib/benchmark Reporter: Grant Ingers

EnwikiDocMaker

2007-12-28 Thread Grant Ingersoll
I am using EnwikiDocMaker with the following algorithm outlined at the bottom (against trunk). After the first round is complete, I am getting java.lang.RuntimeException: java.io.IOException: Bad file descriptor at org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker$Parser.run(EnwikiDocM

[jira] Updated: (LUCENE-1102) EnwikiDocMaker id field

2007-12-28 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1102: Attachment: LUCENE-1102.patch Adds docid Field to the index for EnwikiDocMaker > EnwikiDo

[jira] Created: (LUCENE-1103) WikipediaTokenizer

2007-12-28 Thread Grant Ingersoll (JIRA)
WikipediaTokenizer -- Key: LUCENE-1103 URL: https://issues.apache.org/jira/browse/LUCENE-1103 Project: Lucene - Java Issue Type: Improvement Components: Analysis Reporter: Grant Ingersoll Ass

Re: Doubts with Lucene 2.2.0 - How to do the Indexing lucene´s file.

2007-12-28 Thread Jesiel Trevisan
Thanks Steven, I saw this link, it's nice, but it is still not clear to me how to index my website. For example, my website has the structure below: - Index.jsp tv.jsp dvd.jsp service01.js

Re: Doubts with Lucene 2.2.0 - How to do the Indexing lucene´s file.

2007-12-28 Thread Grant Ingersoll
Please ask these types of questions on java-user. Java-dev is reserved for Lucene internal development. However, it is your responsibility to do the crawling. You might look into Nutch or Aperture or some other crawler that is aware of these kinds of contexts. Lucene is agnostic of the co

RE: Doubts with Lucene 2.2.0 - How to do the Indexing lucene´s file.

2007-12-28 Thread Steven A Rowe
Hi Jesiel, On 12/28/2007 at 2:23 PM, Jesiel Trevisan wrote: > I saw this link, it´s nice, but... for me, it still does not > clear how to do a Indexing my webSite. Lucene is not a search engine - it is a toolkit of parts you may use (a.k.a. a "library") to build a search engine. The Lucene Java

Re: site javadocs link broken

2007-12-28 Thread Doron Cohen
On Dec 28, 2007 5:33 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > On Dec 28, 2007, at 9:50 AM, Doron Cohen wrote: > > Oh I see, what confused me is that the section and the > > page have almost identical names. Any objection to renaming > > the section to "Javadocs for Official Releases"? > >

Build failed in Hudson: Lucene-Nightly #317

2007-12-28 Thread hudson
See http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/317/changes -- [...truncated 766 lines...] A contrib/snowball/src/java A contrib/snowball/src/java/net A contrib/snowball/src/java/net/sf A contrib/snowball/src

Hudson build is back to normal: Lucene-Nightly #318

2007-12-28 Thread hudson
See http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/318/changes