emental values per mime-type.
> The algorithms are pluggable and overridable at any point of interest. You
> can go all the way.
>
> -Original message-
>> From:Walter Underwood <wun...@wunderwood.org>
>> Sent: Wednesday 3rd August 2016 20:03
>> To: solr-u
gust 2016 20:08
> To: solr-user@lucene.apache.org
> Subject: RE: [Non-DoD Source] Re: SOLR + Nutch set up (UNCLASSIFIED)
>
> CLASSIFICATION: UNCLASSIFIED
>
> Shall I assume that, even though Nutch has adaptive capability, I would still
> have to figure out how to trigger it to g
<wun...@wunderwood.org>
> Sent: Wednesday 3rd August 2016 20:03
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR + Nutch set up (UNCLASSIFIED)
>
> That’s good news.
>
> It should reset the interval estimate on page change instead of slowly
> shortening it.
>
>
3 PM
To: solr-user@lucene.apache.org
Subject: [Non-DoD Source] Re: SOLR + Nutch set up (UNCLASSIFIED)
rincipal Engineer
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>
>>> On Aug 3, 2016, at 10:12 AM, Musshorn, Kris T CTR USARMY RDECOM ARL (US)
>> <kris.t.musshorn@mail.mil> wrote:
>>>
>>> CLASSIFICAT
ncipal Engineer
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>
> > On Aug 3, 2016, at 10:12 AM, Musshorn, Kris T CTR USARMY RDECOM ARL (US)
> <kris.t.musshorn@mail.mil> wrote:
> >
> > CLASSIFICATION: UNCLASSIFIED
> >
> >
016, at 10:12 AM, Musshorn, Kris T CTR USARMY RDECOM ARL (US)
> <kris.t.musshorn@mail.mil> wrote:
>
> CLASSIFICATION: UNCLASSIFIED
>
> We are currently using Ultraseek and looking to deprecate it in favor of
> Solr/Nutch.
> Ultraseek runs all the time and auto d
CLASSIFICATION: UNCLASSIFIED
We are currently using Ultraseek and looking to deprecate it in favor of
Solr/Nutch.
Ultraseek runs all the time, auto-detects when pages have changed, and
automatically reindexes them.
Is this possible with Solr/Nutch?
Thanks,
Kris
Hi,
Question 1: If Solr can already parse HTML and documents such as DOC, Excel,
PDF, etc., why do we need Nutch to parse HTML files? What is different?
Question 2: When do we use multiple cores in Solr? Is there a practical business
case where we need multiple cores?
Question 3: When do we go for cloud? What is
for both scaling of query response
and availability if nodes go down.
-- Jack Krupansky
-Original Message-
From: rashmi maheshwari
Sent: Tuesday, January 28, 2014 11:36 AM
To: solr-user@lucene.apache.org
Subject: Solr Nutch
Hi,
Question1 -- When Solr could parse html, documents like
Q1: Nutch doesn’t only parse HTML files; it also uses Hadoop to
achieve large-scale crawling across multiple nodes. It fetches the content of the
HTML file and, yes, it also parses that content.
Q2: In our case we use Solr to crawl some websites and store the content in one
“main” Solr
down.
-- Jack Krupansky
collections and multiple replicas for both scaling of query
response
and availability if nodes go down.
-- Jack Krupansky
1. Nutch follows the links within HTML web pages to crawl the full graph of a
web of pages.
In addition, I think Nutch has a PageRank-like scoring function, as opposed to
Lucene/Solr, whose scoring is based on the vector space model.
koji
Thanks
Regards
Praful Bagai
--
View this message in context:
http://lucene.472066.n3.nabble.com/AjaxSolr-Solr-Nutch-question-tp3995030.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hello Anurag, I'm facing the same problem. Will you please elaborate on how you
solved the problem? It would be great if you could give me a step-by-step
description, as I'm new to Solr.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Spellcheck-in-solr-nutch-integration
I solved the problem. All we need to do is modify the schema file.
Also, the spellcheck index is first created when spellcheck.build=true
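
A minimal sketch of what such a request might look like, in Python. The base URL and handler path are assumptions (they depend on how the spellcheck component is wired into solrconfig.xml), and the helper name is hypothetical; `spellcheck` and `spellcheck.build` are the standard SpellCheckComponent request parameters.

```python
from urllib.parse import urlencode


def build_spellcheck_url(base_url, query, rebuild=False):
    """Return a Solr request URL with spellchecking enabled.

    base_url is an assumed handler URL; it must point at a request
    handler that has the spellcheck component configured.
    """
    params = {"q": query, "spellcheck": "true"}
    if rebuild:
        # Ask Solr to (re)build the spellcheck index as part of this request.
        params["spellcheck.build"] = "true"
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # First request builds the spellcheck index; later ones can skip it.
    print(build_spellcheck_url("http://localhost:8983/solr/spell", "nutch", rebuild=True))
    print(build_spellcheck_url("http://localhost:8983/solr/spell", "nutch"))
```

Building the index is typically done once after indexing rather than on every query, which is why the flag defaults to off in this sketch.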
-
Kumar Anurag
--
View this message in context:
http://lucene.472066.n3.nabble.com/Spellcheck-in-solr-nutch-integration-tp1953232p1988252.html
in this
Solr-nutch integration.
I have a separate Solr 1.4 where options are available for
spellcheck.
What I want to ask is:
1. Is indexing for spellcheck to be done at the same time as indexing the
contents? What are the steps to follow?
2. How can I implement spellcheck in Solr-Nutch
Hi,
I am working with a SaaS vendor who is integrated with Nutch 0.9 and SOLR. We
are looking for some help to migrate this to Nutch 1.0. The work involves:
1) We made changes to Nutch 0.9; these need to be ported to Nutch 1.0.
2) Configure SOLR integration with Nutch 1.0
3)
Uri Boness wrote:
Well... yes, it's a tool that Nutch ships with. It also ships with an
example Solr schema which you can use.
hi,
is there any documentation to understand what is going on in the schema?
<requestHandler name="/nutch" class="solr.SearchHandler">
<lst name="defaults">
<str
Do you mean the schema or the solrconfig.xml?
The request handler is configured in the solrconfig.xml and you can find
out more about this particular configuration in
http://wiki.apache.org/solr/DisMaxRequestHandler?highlight=(CategorySolrRequestHandler)|((CategorySolrRequestHandler)).
To
Thanks for your help.
I use the default Nutch configuration, and I use solrindex to pass the Nutch
results to Solr. I get results when I query, so Nutch works properly
(it gives a url, title, content ...)
I would like to query Solr so that it emphasizes the title field rather than the
content field.
It seems to me that this configuration actually does what you want:
queries mostly on title. The default search field doesn't influence a
dismax query. I suggest you include the debugQuery=true
parameter; it will help you figure out how the matching is performed.
You can read more
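
As a rough illustration of the suggestion above, here is a minimal Python sketch that builds a dismax request weighting `title` above `content` and turns on `debugQuery=true`. The host, handler path, helper name, and boost values are assumptions for illustration, not taken from this thread; `defType`, `qf`, and `debugQuery` are standard Solr request parameters, though a setup like the one discussed here may instead select dismax through the defaults of a configured request handler.

```python
from urllib.parse import urlencode


def build_dismax_url(base_url, user_query, title_boost=5.0, content_boost=1.0):
    """Return a Solr dismax query URL that favors matches in the title field.

    The boost values are illustrative assumptions; tune them against
    the debugQuery output for your own index.
    """
    params = {
        "defType": "dismax",
        "q": user_query,
        # qf lists the fields to search, each with an optional ^boost.
        "qf": f"title^{title_boost} content^{content_boost}",
        # debugQuery adds the parsed query and score explanations to the response.
        "debugQuery": "true",
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_dismax_url("http://localhost:8983/solr/select", "intranet policy"))
```

Comparing the score explanations in the debug output for a title match and a content-only match is a quick way to confirm that the boosts take effect.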
Thanks for the link, so, SolrIndex is NOT a plugin, it is an application... I
use a similar approach...
-Original Message-
From: Uri Boness
Hi,
Nutch comes with support for Solr out of the box. I suggest you follow
the steps as described here:
Well... yes, it's a tool that Nutch ships with. It also ships with an
example Solr schema which you can use.
Fuad Efendi wrote:
Thanks for the link, so, SolrIndex is NOT a plugin, it is an application... I
use a similar approach...
-Original Message-
From: Uri Boness
Hi,
Nutch comes
Hi,
I would like to crawl intranets with Nutch and index them with Solr.
I would like to search mostly on the title of the pages (the one in
<title>This is a title</title>)
I tried to tweak the schema.xml to do that, but nothing is working. I
just have the content indexed.
How do I
How did you configure Nutch?
Make sure you have the parse-html and index-basic configured. The
HtmlParser should by default extract the page title and add it to the
parsed data, and the BasicIndexingFilter by default adds this title to
the NutchDocument and stores it in the title field. All the
Is SolrIndex a plugin for Nutch?
Thanks!
-Original Message-
From: Uri Boness [mailto:ubon...@gmail.com]
Sent: August-24-09 4:42 PM
To: solr-user@lucene.apache.org
Subject: Re: solr nutch url indexing
How did you configure nutch?
Make sure you have the parse-html and index-basic
Hi All, (sorry for the cross-post)
For those in NYC, there will be a Lucene ecosystem (Lucene/Solr/Mahout/
Nutch/Tika/Droids/Lucene ports) Meetup on July 22, hosted by MTV
Networks and co-sponsored with Lucid Imagination.
For more info and to RSVP, see
to Solr.
thank you mike.
--
View this message in context:
http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16708645.html
On 15-Apr-08, at 1:37 PM, khirb7 wrote:
Thank you a lot, you are helpful. Concerning my Solr, I am using the
1.2.0
version; I downloaded it from the Apache download mirror
http://www.apache.org/dyn/closer.cgi/lucene/solr/ . I didn't quite
understand you when you said:
you're trying to apply a
=org.apache.solr.highlight.GapFragmenter
default=true
still use fragsize=100 but I am using <int name="hl.fragsize">400</int> as
shown above.
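
For comparison, the fragment size can also be passed on the request itself rather than through handler defaults, which can help check whether the configured value is being picked up. A minimal Python sketch, assuming a local Solr URL and a field named `content` (both hypothetical here, as is the helper name); `hl`, `hl.fl`, `hl.fragsize`, and `hl.snippets` are standard Solr highlighting parameters.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, field="content", fragsize=400, snippets=1):
    """Return a Solr query URL with per-request highlighting settings."""
    params = {
        "q": query,
        "hl": "true",                  # enable highlighting
        "hl.fl": field,                # field(s) to generate snippets from
        "hl.fragsize": str(fragsize),  # approximate snippet size in characters
        "hl.snippets": str(snippets),  # maximum snippets per field
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_highlight_url("http://localhost:8983/solr/select", "nutch"))
```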
thank you.
--
View this message in context:
http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16656960.html
to modify it. All that in order to return not the first highlighted word
encountered but another one, because of the problem I
explained in my previous messages.
Cheers
--
View this message in context:
http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16603642.html
to
highlight not only the first occurrence of a searched word but up to 1
occurrence of the same word.
cheers
--
View this message in context:
http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16608806.html
On 10-Apr-08, at 12:26 AM, khirb7 wrote:
Hello everybody,
Just one other question: to analyse and modify Solr's snippets, I
want to
know if org.apache.solr.util.HighlightingUtils
is the class generating the snippets and which method generates them,
and
could you please explain to me how are
them to my solr.
Thank you in advance.
--
View this message in context:
http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16537216.html
attention to the punctuation (the
comma or the capital letter)
Thank you in advance.
--
View this message in context:
http://www.nabble.com/Snipets-Solr-nutch-tp16537216p16537460.html
On 7-Apr-08, at 7:12 AM, khirb7 wrote:
khirb7 wrote:
Hello everybody,
I am using Solr in my project, and I want to use the Solr snippets
generated by
the highlighting.
The problem is that these snippets aren't displayed very well;
they are
truncated and not really meaningful.
I heard that