Wouldn’t it be easier if the site ensured that only complete terms are
submitted to the actual search? In an app I worked on some time ago, the
default behavior of the JavaScript component used for autocompletion was to
first autocomplete the term in the input and then submit the query against the
How would you measure which snippet is the best?
On Nov 9, 2014, at 1:59 PM, SolrUser1543 osta...@gmail.com wrote:
Let's say that for some query there are several results, with several hits
for each one, which are shown in the highlighting section of the response.
Is it possible to select only one
The whole idea behind Solr is to solve the problem that you just explained. In
particular, what you need is to define the title field as a solr.TextField and
then define a tokenizer. The tokenizer essentially transforms the initial
text into tokens. Solr has several tokenizers, each with its
When you fire a query against Solr with wt=csv, the response coming from
Solr is *already* in CSV. The CSVResponseWriter is responsible for translating
SolrDocument instances into CSV on the server side, so I don’t see any
reason to do it yourself; Solr already does the heavy lifting
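A minimal sketch of such a request, built in Python. The host, core name and
field list are made-up placeholders; only the wt=csv and fl parameters are
standard Solr request parameters.

```python
from urllib.parse import urlencode


def csv_select_url(base_url, query, fields):
    """Build a Solr select URL that asks the server for CSV output."""
    params = {
        "q": query,
        "wt": "csv",             # Solr's CSV response writer formats the result
        "fl": ",".join(fields),  # fields to include as CSV columns
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and core name, used only for illustration.
url = csv_select_url("http://localhost:8983/solr/collection1", "*:*", ["id", "title"])
print(url)
```

The response body can then be saved or streamed as-is, with no client-side
conversion step.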
I see you’re defining a default value for “rows”. This can be overridden on
the request, and requesting a lot of documents from Solr can stress your
server/cluster (provided, of course, that the client in question has that many
documents). If this is a fixed value and the clients can’t request more
If you’re talking about a generic web crawl you could use something like Nutch
[1]. Keep in mind that this is a full web crawler, and it does a pretty good
job. I’ve been using it for more than 2 years now and I’m very happy, although
I don’t crawl just a couple of sites but a wider spectrum
Don’t worry, the way Hoss explained it is indeed the way I knew it works,
but the example provided in the book piqued my curiosity, hence the question
in this thread.
Regards,
On Sep 30, 2014, at 5:59 PM, Timothy Potter thelabd...@gmail.com wrote:
Indeed - Hoss is correct ... it's a
Perhaps instead of the suggester component you could use the EdgeNGramFilter
and provide partial matches, so you will be able to configure a custom request
handler that will “suggest” terms or phrases for you. I’m using this approach
to provide query suggestions; of course, I’m indexing the
Krupansky j...@basetechnology.com wrote:
I am not aware of any such feature! That doesn't mean it doesn't exist, but I
don't recall seeing it in the Solr source code.
-- Jack Krupansky
-Original Message- From: Jorge Luis Betancourt Gonzalez
Sent: Wednesday, September 24, 2014 1:31 AM
I’ve done something similar to this using the EdgeNGram filter, not the
spellchecker component. I don’t know if this fits your requirements.
The relevant portion of my fieldType config:
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1"
Hi:
I’m trying to change the default configuration for the query component of a
SearchHandler. Basically I want to set a default value for the rows parameter
and have this value shared by all my SearchHandlers. As stated in the
solrconfig.xml comments, this could be accomplished by redeclaring
can be specified, these
will be overridden by parameters in the request
--
<lst name="defaults">
  <str name="echoParams">explicit</str>
  <int name="rows">10</int>
  <str name="df">text</str>
</lst>
...
-- Jack Krupansky
-Original Message- From: Jorge Luis Betancourt Gonzalez
Sent
Which crawler are you using?
On Sep 18, 2014, at 10:14 AM, keeblerh keebl...@yahoo.com wrote:
eShard wrote
Good afternoon,
I'm using solr 4.0 Final
I need movies hidden in zip files to be excluded from the
index.
I can't filter movies on the crawler because then I would have to
You could create a bunch of dynamic fields according to your needs, basically
one dynamic field for each type of data (and several combinations), and then
create a small wrapper around SolrJ that will wrap the patterns defined in
your schema.xml in a more
In one of his talks, Trey Grainger (author of Solr in Action) touches on how
CareerBuilder deals with multilingual content using payloads. It's a little
more work, but I think it would pay off.
On Sep 8, 2014, at 7:58 AM, Jack Krupansky j...@basetechnology.com wrote:
You also need to take
Perhaps what you’re trying to do could be addressed by using the
EdgeNGramFilterFactory filter? For query suggestions I’m using a very similar
approach, this is an extract of the configuration I’m using:
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter
Hi all:
We have a small installation of Solr 3.6 on our hands. Right now we have 3
physical servers (1 master and 2 slaves). The ingestion process is done on the
master, which replicates through Solr's internal mechanism to the slaves, which
handle all the queries. We are trying to update to Solr
I’m using Solr for an analytics use case. One of the requirements is, given a
search query, to get the position of the first hit. I’m indexing web pages,
so given some search criteria the client wants to know the position (first
occurrence) of his webpage in the result set (if it appears at
With Regards
Aman Tandon
On Tue, Jun 24, 2014 at 4:30 AM, Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
I’m using Solr for an analytic use case, one of the requirements is
basically given a search query get the position of the first hit. I’m
indexing web pages, so given a search
a faster
way to do it. But you only need to fetch the URL field. You can ignore
everything else.
wunder
On Jun 23, 2014, at 9:32 PM, Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
Basically given a few search terms (query) the idea is to know given one or
more terms in which
suggest that if the website has
appropriate, good data it should come up on the first page, so it's better
to aim for the first page rather than finding the position.
With Regards
Aman Tandon
On Tue, Jun 24, 2014 at 10:35 AM, Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote
I’d certainly go for the 2nd option. Depending on what you need, you won’t
have to modify Solr itself, just extend it using different plugins.
You’ll need to write different components depending on your specific
requirements. I definitely recommend the talks from Trey Grainger,
Is there some workaround in the Solr ecosystem to get something similar to the
percolator feature offered by Elasticsearch?
Greetings!
In the book Apache Solr Beginner’s Guide there is a section dedicated to
writing new Solr plugins; perhaps it would be a good place to start. There is
also a page about this in the wiki, but it’s a light introduction. I’ve found
that a very good starting point is just to browse through the code
Q1: Nutch doesn’t only handle the parsing of HTML files, it also uses Hadoop to
achieve large-scale crawling using multiple nodes. It fetches the content of
the HTML file, and yes, it also parses its content.
Q2: In our case we use Solr to crawl some websites, store the content in one
“main” solr
I have some experience using Solarium and it has been great so far. In
particular we use the NelmioSolariumBundle to integrate with Symfony2.
Greetings!
On Jan 28, 2014, at 1:54 PM, Felipe Dantas de Souza Paiva
cad_fpa...@uolinc.com wrote:
Hi Folks,
I would like to know what is the best way
A spreadsheet has been mentioned previously on the list. Taking into account
that you already have documents in an index, you could extract the needed
information from your index and feed it into the spreadsheet, and it will
probably give you a rough approximation of the hardware you’ll be
I believe that you are looking for something similar to the percolator feature
present in Elasticsearch. I remember something about a Solr implementation
being discussed here some time ago. Does anyone know if there has been any
progress in this area?
On Jan 27, 2014, at 8:18 AM, Furkan KAMACI
If I remember correctly, Trey Grainger explained in one of his talks
a few techniques that could be of use. If the equivalency is not dynamic
you could just use synonyms. Otherwise some kind of offline processing should
be used to compute the similarity between your queries (given
Happy new year!
I’ve developed some custom update request processors to implement some custom
logic needed in some use cases. I’m trying to write tests for these processors,
and I’d like to test them in a way very similar to how the built-in processors
are tested in the Solr source code. Is there
Is it possible to export the doc into markdown?
- Original Message -
From: Chris Hostetter hossman_luc...@fucit.org
To: solr-user@lucene.apache.org
Sent: Monday, December 9, 2013 14:00:34
Subject: Re: ANNOUNCE: Apache Solr Reference Guide 4.6
: Can we please give some thought to
Hi:
I'm using Solr 3.6 with the dismax query parser. I've found that docs that
don't have all the query terms get ranked above others that contain all the
terms in the search query. Using debugQuery I could see that most of the score
in these cases comes from the coord(q,d) factor. Is
+1 on this.
- Original Message -
From: Otis Gospodnetic otis.gospodne...@gmail.com
To: solr-user@lucene.apache.org
Sent: Friday, December 6, 2013 9:35:25
Subject: Re: Introducing Luwak for high-performance stored Lucene queries
Hi Charlie,
Very nice - thanks!
I'd love to see a
I think that some experience in this area could be provided by Trey Grainger,
author of Solr in Action. I believe that some of his work at CareerBuilder
involves the creation of something (somewhat) similar to what you're trying to
accomplish. I must say that I'm also interested in this topic, but
Perhaps what you want is a transparent proxy? You could use nginx, squid,
varnish, etc. We've been evaluating varnish as a possibility to run in front of
our Solr server and take advantage of the HTTP caching that varnish does so
well.
Greetings!
- Original Message -
From: Markus Jelsma
Hi everybody:
Is there any way of forcing a UTF-8 conversion on the queries that are written
to the log? I've deployed Solr in Tomcat 7. The file appears to be a UTF-8
file, but I'm seeing this in the logs:
INFO: [] webapp=/solr path=/select
I'm seeing strange behavior from the gap fragmenter on Solr 3.6. Right now this
is my configuration for the gap fragmenter:
<fragmenter name="gap"
            default="true"
            class="solr.highlight.GapFragmenter">
  <lst name="defaults">
    <int name="hl.fragsize">150</int>
Are you using the suggester component, or a separate core? I've used a
separate core to store suggestions and order these suggestions (queries
performed on the frontend) using a time decay function, and it works great for
me.
Regards,
- Original Message -
From: SolrLover
For that core, just use a boost factor as explained in [1].
You could use a query like this to see (before making any change) how your
suggestions will be retrieved. In this case a query for goog has been made,
and recent documents will be boosted (an extra bonus will be given for the
newer
Sorry, I forgot the link:
[1] - http://wiki.apache.org/solr/SolrRelevancyFAQ
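As far as I recall, the date boost described in that FAQ relies on Solr's
recip(x,m,a,b) function, which evaluates to a / (m*x + b), with x being the
document age in milliseconds. A rough Python sketch of how such a boost decays
with age follows; the constants m=3.16e-11, a=1, b=1 are the ones I remember
from the FAQ, so treat them as illustrative rather than authoritative.

```python
MS_PER_YEAR = 365 * 24 * 3600 * 1000


def recip(x, m, a, b):
    """Mirror of Solr's recip function: a / (m*x + b)."""
    return a / (m * x + b)


def recency_boost(age_ms, m=3.16e-11, a=1.0, b=1.0):
    """Multiplicative boost for a document that is age_ms milliseconds old."""
    return recip(age_ms, m, a, b)


for years in (0, 1, 2, 5):
    print(years, round(recency_boost(years * MS_PER_YEAR), 3))
```

With these constants a brand-new document gets a boost of 1.0 and a
one-year-old document roughly half of that.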
- Original Message -
From: Ing. Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu
To: solr-user@lucene.apache.org
Sent: Tuesday, October 1, 2013 13:34:03
Subject: Re: Auto Suggest - Time decay
of users become a query, this
query should already be in the cache. These are just thoughts, but I hope they
are useful to you.
Regards,
- Original Message -
From: Ing. Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu
To: solr-user@lucene.apache.org
Sent: Friday, September 27, 2013
with some arbitrary number?
On Thursday, September 26, 2013, Ing. Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
Great!! I hadn't seen your message yet. Perhaps you could create a PR to
that GitHub repository, so it will be in sync with current versions of
Solr.
- Original Message
=text/javascript
src=#{url_for_solr}/js/lib/jquery-1.7.2.min.js/script
On Wed, Sep 25, 2013 at 7:33 PM, Ing. Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
Try querying the core where the data has been imported, something like:
http://localhost:8983/solr/suggestions/select?q=uc
I think you could use boosting queries: for group A you boost one category and
for group B some other category.
- Original Message -
From: Snubbel solrforum.20.x...@spamgourmet.com
To: solr-user@lucene.apache.org
Sent: Thursday, September 26, 2013 8:01:36
Subject: Sorting dependent
I've used a separate core for storing suggestions, based on what I see in
https://github.com/cominvent/autocomplete. You can check the blog post at
www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/. This is
really flexible; on the downside, it does not use the suggester
="responseHeader"><int name="status">0</int><int
name="QTime">2239</int></lst>
</response>
Are you able to confirm if this is the expected response?
On Wed, Sep 25, 2013 at 1:46 PM, Ing. Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
I've used a separated core for storing suggestions, based on what I see
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int
name="QTime">2239</int></lst>
</response>
Are you able to confirm if this is the expected response?
On Wed, Sep 25, 2013 at 1:46 PM, Ing. Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
I've used a separated
to
the jquery library but I can't seem to find the directory referenced,
line: <script type="text/javascript"
src="#{url_for_solr}/admin/jquery-1.4.3.min.js">. Do you know where
#{url_for_solr} points to?
On Wednesday, September 25, 2013, Ing. Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote
: Implementing Solr Suggester for Autocomplete (multiple columns)
That seems to work. I get back an xml containing a bunch of suggestions.
Can we agree that it's jquery that's the problem?
On Wednesday, September 25, 2013, Ing. Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
Try querying
Sent: Wednesday, September 25, 2013 15:40:00
Subject: Re: Implementing Solr Suggester for Autocomplete (multiple columns)
Not yet but I do see the $ not found in console.
On Wednesday, September 25, 2013, Ing. Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
As far as I can tell
If query suggestion is what you are looking for, what we've done is store the
user queries in a separate core and pull the suggestions from there.
- Original Message -
From: Brendan Grainger brendan.grain...@gmail.com
To: solr-user@lucene.apache.org
Sent: Thursday, June 13
of the query syntax) by the parser.
Providing you're using the edismax parser, it should be just fine for any
other queries, like '+ foo' , 'foo +', '++' ...
J.
On 23 April 2013 15:09, Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
Hi Kai:
Thanks for your reply, for what I've
Hi!
Currently I'm working on a basic search engine. The main problem is that
during some tests an issue was detected: if a user searches for only the + or -
term, or the + string, it causes an exception in my application. The problem
is caused by an
that char in search terms.
Special chars are + - ! ( ) { } [ ] ^ ~ * ? : \ / at the moment.
The %2B is just the url encoding, but it will still be a + for Solr, so just
put a \ in front of the chars I mentioned.
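A small sketch of that escaping step in Python, assuming exactly the list of
special characters given above (the function name is made up for
illustration):

```python
# Characters listed above as special for the query parser.
SPECIAL_CHARS = set('+-!(){}[]^~*?:\\/')


def escape_solr_term(term):
    """Put a backslash in front of every special character in a user term."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


print(escape_solr_term("+"))        # a lone plus sign becomes \+
print(escape_solr_term("foo:bar"))  # the colon is escaped, letters are untouched
```

The escaping has to happen before URL encoding, since %2B is decoded back to
+ before the parser sees it.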
Cheers,
Kai
On 23.04.2013 at 15:41, Jorge Luis Betancourt Gonzalez wrote:
Hi
) by the parser.
Providing you're using the edismax parser, it should be just fine for any
other queries, like '+ foo' , 'foo +', '++' ...
J.
On 23 April 2013 15:09, Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
Hi Kai:
Thanks for your reply, for what I've understood this logic
Hi all:
I'm building a document search platform, basically indexing a lot of PDF
files. Some of these files have an index, which means that when I query for
normativos in my application (built using Symfony2+PHP+Solarium) I get a few
results like this:
or so.
-- Jack Krupansky
-Original Message-
From: Jorge Luis Betancourt Gonzalez
Sent: Friday, March 29, 2013 10:34 PM
To: solr-user@lucene.apache.org
Subject: Getting better snippets in highlighting component
Hi all:
I'm building a document search platform, basically indexing a lot of PDF
I'm using Solr 3.6.2 to index some data crawled using Nutch. In my schema I
have one field with all the content extracted from the page, which could
possibly include email addresses. This is the configuration of my schema:
<fieldType name="text" class="solr.TextField"
would use leading wildcard query.
q=*@gmail.com
There was a similar question recently:
http://search-lucene.com/m/XF2ejnM6Vi2
--- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu wrote:
From: Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu
Subject: Question about email
Currently I'm using a separate core to query suggestions; for this I
started from https://github.com/cominvent/autocomplete. Basically, I'm only
using the suggester component for term suggestions based on a
tokenized field in my schema (all of this in Solr 3.6), perhaps instead of
Agreed, PHP and Solr are an excellent combination. I'm using Solr 3.6 + PHP
(Symfony2 + NelmioSolariumBundle + Solarium) and getting excellent results.
Solarium as a PHP library is great too. Right now it lacks Solr 4 support,
but for Solr 3.6 it's great.
- Original Message -
From:
Hi:
I'm trying to build a custom update handler to accomplish one specific task. In
our app we do query suggestions based on previous queries passed into our
frontend app. The thing is that instead of getting these queries from the Solr
logs, we store them in a separate core. So far so good, but
Hi:
I'm working on a search engine for several PDF documents. Right now one of the
requirements is that we provide not only the documents matching the search
criteria but also the page that matches the criteria. Normally Tika only
extracts the text content and does not make this distinction, but
fields and will break quickly.
The best way to do it is to index pages as documents. You can use field
collapsing to group pages from the same document together.
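A rough sketch of that idea in Python: one Solr document per page, plus the
request parameters that turn on result grouping. The field names (doc_id,
page, text) and the id scheme are hypothetical; group and group.field are the
standard grouping parameters.

```python
def pages_to_docs(doc_id, pages):
    """Turn the extracted pages of one PDF into one Solr document per page."""
    return [
        {"id": f"{doc_id}_p{number}", "doc_id": doc_id, "page": number, "text": text}
        for number, text in enumerate(pages, start=1)
    ]


def grouped_query_params(query):
    """Query parameters that collapse page hits under their parent document."""
    return {"q": query, "group": "true", "group.field": "doc_id"}


docs = pages_to_docs("manual", ["first page text", "second page text"])
params = grouped_query_params("text:page")
print(docs[1]["id"], params["group.field"])
```

Each group in the response then corresponds to one PDF, and the documents
inside the group tell you which pages matched.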
Upayavira
On Tue, Feb 5, 2013, at 02:00 PM, Jorge Luis Betancourt Gonzalez wrote:
Hi:
I'm working on a search engine for several PDF
Hi:
I'm currently working with Solr 3.6.1, but Solr 4 has great features like the
ones bundled with SolrCloud. The content in the index is not really the problem
for the transition; the thing is that I have a large app written in PHP +
Solarium that interacts with the index in Solr 3. As far as I
.
Your mileage may vary, but for that particular app, that is what it
took.
Note, 4.0 can work in a 3.x way (old style replication, etc). You don't
need to use SolrCloud etc when using 4.0.
Upayavira
On Sat, Jan 5, 2013, at 08:20 AM, Jorge Luis Betancourt Gonzalez wrote:
Hi:
I'm currently
}
remove_s : { set : null } }
]'
/* example stolen from Yonik's ApacheCon talk */
Upayavira
On Sat, Dec 15, 2012, at 01:34 AM, Jorge Luis Betancourt Gonzalez wrote:
Hi all:
I'm trying to build a query suggestion system using solr (also used to
index all the data in the app). I've a separated
4.0 thing. In order for it to work, you need to store
every field, as what it does behind the scenes is retrieve the stored
fields, rebuilds the document, and then posts the whole document back.
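For illustration, here is a small Python helper that builds the kind of JSON
body such an update request carries, using the "set" and "inc" modifiers. The
field names (count, last_seen) and the document id are made up; the resulting
string would be posted to the core's update handler.

```python
import json


def atomic_update(doc_id, set_fields=None, inc_fields=None):
    """Build the JSON body for an update that touches only some fields."""
    doc = {"id": doc_id}
    for name, value in (set_fields or {}).items():
        doc[name] = {"set": value}   # replace the stored value
    for name, amount in (inc_fields or {}).items():
        doc[name] = {"inc": amount}  # add to the stored numeric value
    return json.dumps([doc])


# Hypothetical suggestion document: bump a counter and refresh a timestamp.
body = atomic_update("q1", {"last_seen": "2012-12-15T00:00:00Z"}, {"count": 1})
print(body)
```

Because Solr rebuilds the whole document from stored fields, any field that is
not stored would be lost after such an update.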
Upayavira
On Sat, Dec 15, 2012, at 04:52 PM, Jorge Luis Betancourt Gonzalez wrote:
Is this updatable
Hi Guillaume:
I beg to differ. It's true that native Solr support has been a big aid to
developers using Solr from many programming languages. But building all the
queries by hand is not wise and in any case is hard to maintain; it's easier
to use some OO library to interact with Solr. For
Hi:
Is there any way that I can prevent a document from being indexed? I have a
separate core only for query suggestions. These queries are stored right from
the frontend app, so I'm trying to prevent ill-intentioned queries from
being stored in it, while keeping the logic of what I
Any news on the Solarium project? It's the one I'm using with Solr 3.6!
- Original Message -
From: Bill Au bill.w...@gmail.com
To: solr-user@lucene.apache.org, Arkadi Colson ark...@smartbit.be
Sent: Friday, December 7, 2012 13:40:20
Subject: Re: PHP client
I have not used the pecl Solr
I'm trying to use it to search through news websites, but I'm interested in
classification at index time. Is there any available solution for this?
Greetings!
On Dec 3, 2012, at 12:37 PM, Stanislaw Osinski stanis...@osinski.name wrote:
I mean measuring the similarity between the document in
tokenisation anyhow, as a search for
'universidad' will not match your term 'universidad,'
But you are on the right track - to improve suggestions, improve what is
in your index.
Upayavira
On Mon, Nov 26, 2012, at 07:54 PM, Jorge Luis Betancourt Gonzalez wrote:
Hi:
I've configured my
Hi:
I've configured my Solr setup to use the suggester component and to get term
suggestions from a PHP application. The thing is that I'm getting results like
'universidad,' (note the punctuation sign). Is there any way I can get rid of
this? Or do I need to create a separate field and strip all
I'm currently using solarium with solr 3.6, perhaps you can tweak solarium as
needed? I suppose that pull requests are welcome into solarium for solr 4.
Greetings!
On Nov 12, 2012, at 2:56 PM, Bill Au bill.w...@gmail.com wrote:
Anyone know of a PHP client that is compatible with Solr 4.0.0?
I think that Solr by itself doesn't store the queries (correct me if I'm
wrong about this), but you can accomplish what you want by processing the Solr
log (it's the only way, I think). From the Solr log you can get the queries and
then process the queries according to your needs, and change
not be sufficient for you.
Upayavira
On Mon, Oct 8, 2012, at 01:24 AM, Jorge Luis Betancourt Gonzalez wrote:
Hi!
I was wondering if there is any built-in mechanism that allows me to
store the queries made to a Solr server inside the index itself. I know
that the suggester module exists
Hi!
I was wondering if there is any built-in mechanism that allows me to store the
queries made to a Solr server inside the index itself. I know that the
suggester module exists, but as far as I know it only works for terms existing
in the index, and not with queries. I remember reading about
Thanks a lot for all the replies. Chris, it worked out with this mm value:
<str name="mm">
10%
</str>
If this version of Solr is affected by the bug you pointed out, shouldn't it
fail with this value as well?
Greetings!
On Oct 4, 2012, at 8:48 PM, Jorge Luis Betancourt Gonzalez wrote:
Hi Chris
Hi:
I'm having an issue with Solr 3.6.1 and I sense that it comes from a lack of
understanding on my part. I'm building a search engine, using of course Solr to
store the inverted index, so far so good. When I search for a term, let's say
java, I get 761 results; then querying the index with a php term gives me
:06 AM, Jorge Luis Betancourt Gonzalez wrote:
Hi:
I'm having an issue with Solr 3.6.1 and I sense that it comes from a lack of
understanding on my part. I'm building a search engine, using of course Solr to
store the inverted index, so far so good. When I search for a term, let's say
java, I get 761
the error Jorge?
Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html
On Thu, Oct 4, 2012 at 1:36 PM, Jorge Luis Betancourt Gonzalez
jlbetanco...@uci.cu wrote:
Hi:
Thanks for all the replies, right now I
Thanks for the quick response. I got the same result. What I'm trying to
accomplish is a straight OR between all the clauses or terms in my query;
the value I should use is 0, right?
Hi Chris:
I'm using solr 3.6.1, is the bug present in this version?
Greetings!
On Oct 4, 2012, at 6:11 PM, Chris Hostetter wrote:
: GRAVE: java.lang.NumberFormatException: For input string:
: 100
:
: at