subject:"Solr Nutch"

Re: SOLR + Nutch set up (UNCLASSIFIED)

2016-08-03 Thread Walter Underwood

emental values per mime-type. > The algorithms are pluggable and overridable at any point of interest. You > can go all the way. > > -Original message- >> From:Walter Underwood <wun...@wunderwood.org> >> Sent: Wednesday 3rd August 2016 20:03 >> To: solr-u

RE: [Non-DoD Source] Re: SOLR + Nutch set up (UNCLASSIFIED)

2016-08-03 Thread Markus Jelsma

gust 2016 20:08 > To: solr-user@lucene.apache.org > Subject: RE: [Non-DoD Source] Re: SOLR + Nutch set up (UNCLASSIFIED) > > CLASSIFICATION: UNCLASSIFIED > > Shall I assume that, even though nutch has adaptive capability, I would still > have to figure out how to trigger it to g

RE: SOLR + Nutch set up (UNCLASSIFIED)

2016-08-03 Thread Markus Jelsma

<wun...@wunderwood.org> > Sent: Wednesday 3rd August 2016 20:03 > To: solr-user@lucene.apache.org > Subject: Re: SOLR + Nutch set up (UNCLASSIFIED) > > That’s good news. > > It should reset the interval estimate on page change instead of slowly > shortening it. >
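
The difference between resetting the re-fetch interval on a page change and slowly shortening it can be illustrated with a small sketch. This is not Nutch's actual scheduling code: the function names, rates, and bounds below are all assumptions chosen only to show the two policies side by side.

```python
MIN_INTERVAL = 60.0          # seconds; hypothetical lower bound
MAX_INTERVAL = 30 * 86400.0  # seconds; hypothetical upper bound


def clamp(interval):
    """Keep an interval inside the assumed [MIN_INTERVAL, MAX_INTERVAL] range."""
    return max(MIN_INTERVAL, min(MAX_INTERVAL, interval))


def gradual_update(interval, changed, inc_rate=0.4, dec_rate=0.2):
    """Shrink the interval a little when the page changed, grow it otherwise."""
    factor = (1.0 - dec_rate) if changed else (1.0 + inc_rate)
    return clamp(interval * factor)


def reset_update(interval, changed, reset_to=3600.0, inc_rate=0.4):
    """Jump straight back to a short interval on change, grow it otherwise."""
    if changed:
        return clamp(reset_to)
    return clamp(interval * (1.0 + inc_rate))
```

With a one-day interval and a detected change, the gradual policy only drops to roughly 80% of a day, while the reset policy returns to the assumed one-hour interval immediately.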

RE: [Non-DoD Source] Re: SOLR + Nutch set up (UNCLASSIFIED)

2016-08-03 Thread Musshorn, Kris T CTR USARMY RDECOM ARL (US)

3 PM To: solr-user@lucene.apache.org Subject: [Non-DoD Source] Re: SOLR + Nutch set up (UNCLASSIFIED) All active links contained in this email were disabled. Please verify the identity of the sender, and confirm the authenticity of all links contained within the message prior to copying and pas

Re: SOLR + Nutch set up (UNCLASSIFIED)

2016-08-03 Thread Walter Underwood

rincipal Engineer >> wun...@wunderwood.org >> http://observer.wunderwood.org/ (my blog) >> >> >>> On Aug 3, 2016, at 10:12 AM, Musshorn, Kris T CTR USARMY RDECOM ARL (US) >> <kris.t.musshorn@mail.mil> wrote: >>> >>> CLASSIFICAT

Re: SOLR + Nutch set up (UNCLASSIFIED)

2016-08-03 Thread Marco Scalone

ncipal Engineer > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > > On Aug 3, 2016, at 10:12 AM, Musshorn, Kris T CTR USARMY RDECOM ARL (US) > <kris.t.musshorn@mail.mil> wrote: > > > > CLASSIFICATION: UNCLASSIFIED > > > >

Re: SOLR + Nutch set up (UNCLASSIFIED)

2016-08-03 Thread Walter Underwood

016, at 10:12 AM, Musshorn, Kris T CTR USARMY RDECOM ARL (US) > <kris.t.musshorn@mail.mil> wrote: > > CLASSIFICATION: UNCLASSIFIED > > We are currently using ultraseek and looking to deprecate it in favor of > solr/nutch. > Ultraseek runs all the time and auto d

SOLR + Nutch set up (UNCLASSIFIED)

2016-08-03 Thread Musshorn, Kris T CTR USARMY RDECOM ARL (US)

CLASSIFICATION: UNCLASSIFIED We are currently using ultraseek and looking to deprecate it in favor of solr/nutch. Ultraseek runs all the time and auto detects when pages have changed and automatically reindexes them. Is this possible with SOLR/nutch? Thanks, Kris

Solr Nutch

2014-01-28 Thread rashmi maheshwari

Hi, Question1 -- When Solr could parse html, documents like doc, excel pdf etc, why do we need nutch to parse html files? what is different? Questions 2: When do we use multiple cores in Solr? any practical business case when we need multiple cores? Question 3: When do we go for cloud? What is

Re: Solr Nutch

2014-01-28 Thread Jack Krupansky

for both scaling of query response and availability if nodes go down. -- Jack Krupansky -Original Message- From: rashmi maheshwari Sent: Tuesday, January 28, 2014 11:36 AM To: solr-user@lucene.apache.org Subject: Solr Nutch Hi, Question1 -- When Solr could parse html, documents like

Re: Solr Nutch

2014-01-28 Thread Jorge Luis Betancourt Gonzalez

Q1: Nutch doesn’t only handle the parsing of HTML files, it also uses Hadoop to achieve large-scale crawling using multiple nodes; it fetches the content of the HTML file, and yes, it also parses its content. Q2: In our case we use Solr to crawl some website, store the content in one “main” solr

Re: Solr Nutch

2014-01-28 Thread Alexei Martchenko

down. -- Jack Krupansky -Original Message- From: rashmi maheshwari Sent: Tuesday, January 28, 2014 11:36 AM To: solr-user@lucene.apache.org Subject: Solr Nutch Hi, Question1 -- When Solr could parse html, documents like doc, excel pdf etc, why do we need nutch to parse html

Re: Solr Nutch

2014-01-28 Thread rashmi maheshwari

To: solr-user@lucene.apache.org Subject: Solr Nutch Hi, Question1 -- When Solr could parse html, documents like doc, excel pdf etc, why do we need nutch to parse html files? what is different? Questions 2: When do we use multiple cores in Solr? any practical business case when

Re: Solr Nutch

2014-01-28 Thread Markus Jelsma

Message- From: rashmi maheshwari Sent: Tuesday, January 28, 2014 11:36 AM To: solr-user@lucene.apache.org Subject: Solr Nutch Hi, Question1 -- When Solr could parse html, documents like doc, excel pdf etc, why do we need nutch to parse html files? what is different? Questions

Re: Solr Nutch

2014-01-28 Thread Alexei Martchenko

collections and multiple replicas for both scaling of query response and availability if nodes go down. -- Jack Krupansky -Original Message- From: rashmi maheshwari Sent: Tuesday, January 28, 2014 11:36 AM To: solr-user@lucene.apache.org Subject: Solr Nutch Hi

Re: Solr Nutch

2014-01-28 Thread rashmi maheshwari

Krupansky -Original Message- From: rashmi maheshwari Sent: Tuesday, January 28, 2014 11:36 AM To: solr-user@lucene.apache.org Subject: Solr Nutch Hi, Question1 -- When Solr could parse html, documents like doc, excel pdf etc, why do we need nutch

Re: Solr Nutch

2014-01-28 Thread Koji Sekiguchi

1. Nutch follows the links within HTML web pages to crawl the full graph of a web of pages. In addition, I think Nutch has a PageRank-like scoring function, as opposed to Lucene/Solr, which are based on vector space model scoring. koji --

AjaxSolr + Solr + Nutch question

2012-07-14 Thread praful

?? Thanks Regards Praful Bagai -- View this message in context: http://lucene.472066.n3.nabble.com/AjaxSolr-Solr-Nutch-question-tp3995030.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Spellcheck in solr-nutch integration

2011-02-05 Thread 666

Hello Anurag, I'm facing the same problem. Will u please elaborate on how u solved the problem? It would be great if u give me a step by step description as I'm new in Solr. -- View this message in context: http://lucene.472066.n3.nabble.com/Spellcheck-in-solr-nutch-integration

Re: Spellcheck in solr-nutch integration

2011-02-05 Thread Anurag

on how u solved the problem? It would be great if u give me a step by step description as I'm new in Solr. -- If you reply to this email, your message will be added to the discussion below: http://lucene.472066.n3.nabble.com/Spellcheck-in-solr-nutch-integration

Re: Spellcheck in solr-nutch integration

2010-11-29 Thread Anurag

I solved the problem. All we need is to modify the schema file. Also, the spellcheck index is created first when spellcheck.build=true - Kumar Anurag -- View this message in context: http://lucene.472066.n3.nabble.com/Spellcheck-in-solr-nutch-integration-tp1953232p1988252.html Sent from

Spellcheck in solr-nutch integration

2010-11-23 Thread Anurag

in this Solr-nutch integration. I have got a separate Solr-1.4 where there are options available for Spellcheck. What I want to ask is... 1. Is indexing for spellcheck to be done at the same time as indexing the contents? What are the steps to follow? 2. How can I implement spellcheck in solr-nutch

Seeking Solr/Nutch consultant in San Jose, CA

2009-09-30 Thread Leann Pereira

Hi, I am working with a SaaS vendor who is integrated with Nutch 0.9 and SOLR. We are looking for some help to migrate this to Nutch 1.0. The work involves: 1) We made changes to Nutch 0.9; these need to be ported to Nutch 1.0. 2) Configure SOLR integration with Nutch 1.0 3)

Re: solr nutch url indexing

2009-08-26 Thread last...@gmail.com

Uri Boness wrote: Well... yes, it's a tool that Nutch ships with. It also ships with an example Solr schema which you can use. hi, is there any documentation to understand what is going on in the schema? <requestHandler name="/nutch" class="solr.SearchHandler"> <lst name="defaults"> <str

Re: solr nutch url indexing

2009-08-26 Thread Uri Boness

Do you mean the schema or the solrconfig.xml? The request handler is configured in the solrconfig.xml and you can find out more about this particular configuration in http://wiki.apache.org/solr/DisMaxRequestHandler?highlight=(CategorySolrRequestHandler)|((CategorySolrRequestHandler)). To

Re: solr nutch url indexing

2009-08-25 Thread Thibaut Lassalle

Thanks for your help. I use the default Nutch configuration and I use solrindex to give the Nutch result to Solr. I have results when I query, therefore Nutch works properly (it gives a url, title, content ...). I would like to query on Solr to emphasize the title field and not the content field.

Re: solr nutch url indexing

2009-08-25 Thread Uri Boness

It seems to me that this configuration actually does what you want - queries on title mostly. The default search field doesn't influence a dismax query. I would suggest including the debugQuery=true parameter; it will help you figure out how the matching is performed. You can read more
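
A title-weighted dismax request with debugging enabled could be built along these lines. This is a minimal sketch: the base URL, the field names `title` and `content`, and the boost values are assumptions, and the helper name is hypothetical. `defType`, `qf`, and `debugQuery` are standard Solr request parameters.

```python
from urllib.parse import urlencode


def title_weighted_query_url(base_url, user_query,
                             title_boost=5.0, content_boost=1.0):
    """Build a dismax request URL that weights title above content.

    The boosts are illustrative values, not recommendations.
    """
    params = {
        "q": user_query,
        "defType": "dismax",
        # qf lists the fields to search, each with an optional ^boost
        "qf": f"title^{title_boost} content^{content_boost}",
        # debugQuery=true asks Solr to explain how each match was scored
        "debugQuery": "true",
    }
    return base_url + "?" + urlencode(params)


print(title_weighted_query_url("http://localhost:8983/solr/select", "intranet"))
```

The debug section of the response then shows the per-field contribution to each document's score, which makes it easier to see whether the title boost is taking effect.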

RE: solr nutch url indexing

2009-08-25 Thread Fuad Efendi

Thanks for the link, so, SolrIndex is NOT a plugin, it is an application... I use a similar approach... -Original Message- From: Uri Boness Hi, Nutch comes with support for Solr out of the box. I suggest you follow the steps as described here:

Re: solr nutch url indexing

2009-08-25 Thread Uri Boness

Well... yes, it's a tool that Nutch ships with. It also ships with an example Solr schema which you can use. Fuad Efendi wrote: Thanks for the link, so, SolrIndex is NOT a plugin, it is an application... I use a similar approach... -Original Message- From: Uri Boness Hi, Nutch comes

solr nutch url indexing

2009-08-24 Thread Lassalle, Thibaut

Hi, I would like to crawl intranets with nutch and index them with solr. I would like to search mostly on the title of the pages (the one in <title>This is a title</title>). I tried to tweak the schema.xml to do that but nothing is working. I just have the content indexed. How do I

Re: solr nutch url indexing

2009-08-24 Thread Uri Boness

How did you configure nutch? Make sure you have the parse-html and index-basic configured. The HtmlParser should by default extract the page title and add it to the parsed data, and the BasicIndexingFilter by default adds this title to the NutchDocument and stores it in the title field. All the

RE: solr nutch url indexing

2009-08-24 Thread Fuad Efendi

Is SolrIndex a plugin for Nutch? Thanks! -Original Message- From: Uri Boness [mailto:ubon...@gmail.com] Sent: August-24-09 4:42 PM To: solr-user@lucene.apache.org Subject: Re: solr nutch url indexing How did you configure nutch? Make sure you have the parse-html and index-basic

NYC Apache Lucene/Solr/Nutch/etc. Meetup

2009-07-03 Thread Grant Ingersoll

Hi All, (sorry for the cross-post) For those in NYC, there will be a Lucene ecosystem (Lucene/Solr/Mahout/ Nutch/Tika/Droids/Lucene ports) Meetup on July 22, hosted by MTV Networks and co-sponsored with Lucid Imagination. For more info and to RSVP, see

Re: Snipets Solr/nutch

2008-04-15 Thread khirb7

to Solr. thank you mike. -- View this message in context: http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16708645.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Snipets Solr/nutch

2008-04-15 Thread Mike Klaas

On 15-Apr-08, at 1:37 PM, khirb7 wrote: Thank you a lot, you are helpful. Concerning my Solr, I am using the 1.2.0 version; I downloaded it from the Apache download mirror http://www.apache.org/dyn/closer.cgi/lucene/solr/ . I haven't understood you well when you said: you're trying to apply a

Re: Snipets Solr/nutch

2008-04-13 Thread khirb7

=org.apache.solr.highlight.GapFragmenter default=true still uses fragsize=100, but I am using <int name="hl.fragsize">400</int> as shown above. thank you. -- View this message in context: http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16656960.html Sent from the Solr - User mailing list archive at Nabble.com.
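
One way to check whether a configured fragment size is being picked up is to pass it explicitly on the request and compare the results. The sketch below assumes a local Solr URL and a `content` field, and the helper name is hypothetical. `hl`, `hl.fl`, and `hl.fragsize` are Solr highlighting parameters, though whether a given Solr release honours `hl.fragsize` depends on the version.

```python
from urllib.parse import urlencode


def highlight_query_url(base_url, query, field="content", fragsize=400):
    """Build a request URL that asks for highlighted snippets of a given size."""
    params = {
        "q": query,
        "hl": "true",                  # turn highlighting on
        "hl.fl": field,                # field to generate snippets from
        "hl.fragsize": str(fragsize),  # approximate snippet length in characters
    }
    return base_url + "?" + urlencode(params)


print(highlight_query_url("http://localhost:8983/solr/select", "nutch"))
```

If the snippets change length when the parameter is sent on the request but not when it is set in solrconfig.xml, the configured default is probably not attached to the handler being queried.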

Re: Snipets Solr/nutch

2008-04-10 Thread khirb7

to modify it. all that in order to not return the first word encountered highlighted but to return an other one because of the problem I explained in my previous messages Cheers -- View this message in context: http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16603642.html Sent from the Solr

Re: Snipets Solr/nutch(maxFragSize?)

2008-04-10 Thread khirb7

to highlight not only the first occurrence of a searched word but up to 1 occurrence of the same word. cheers -- View this message in context: http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16608806.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Snipets Solr/nutch

2008-04-10 Thread Mike Klaas

On 10-Apr-08, at 12:26 AM, khirb7 wrote: Hello everybody, just one other question: to analyse and modify Solr's snippets, I want to know if org.apache.solr.util.HighlightingUtils is the class generating the snippets and which method generates them, and could you please explain to me how are

Snipets Solr/nutch

2008-04-07 Thread khirb7

them to my solr. thank you in advance. -- View this message in context: http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16537216.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Snipets Solr/nutch

2008-04-07 Thread khirb7

attention to the punctuation (the comma or the capital letter) thank you in advance. -- View this message in context: http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16537460.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Snipets Solr/nutch

2008-04-07 Thread Mike Klaas

On 7-Apr-08, at 7:12 AM, khirb7 wrote: khirb7 wrote: Hello everybody, I am using Solr in my project, and I want to use the Solr snippets generated by the highlighting. The problem is that these snippets aren't really well displayed; they are truncated and not really meaningful. I heard that

42 matches
