On Thu, May 5, 2011 at 5:08 AM, Leonardo Souza leonardo...@gmail.com wrote:
Hi guys,
Can I have a field name with a period (.)?
Like in *file.size*
Cannot find now where this is documented, but from what I remember it is
recommended to use only characters A-Z, a-z, 0-9, and underscore (_) in
another question
If I define different fields with different boosts and then copy them into
another field and search using this universal field, will the boosts
be applied?
--
View this message in context:
http://lucene.472066.n3.nabble.com/copyField-tp2902242p2902242.html
Sent from
Hello deniz,
You could create a new field, say FullName, which is a copyField of
firstname and surname. Search on both the new field and location, but boost
the query on the new field.
Regards
Aditya
www.findbestopensource.com
On Thu, May 5, 2011 at 9:21 AM, deniz denizdurmu...@gmail.com wrote:
I am asking specifically because I am wondering if it is worth my time
to read the Enterprise server book or if there is too much of a
branch between the two?
If I read the book are there any parts of the book specifically that
won't be relevant?
Thanks,
Bryan Rasmussen
Hello,
Thanks for the answers. I use branch 1.4 and I have successfully applied the
solr-2010 patch.
Now I want to use the collate spellchecking. What should my URL look like? I
tried this, but
it's not working (the result is the same as Solr without solr-2010).
http://localhost:8983/solr/select?q=man
Does Solr support lemmatization?
I found a document which says that Solr supports lemmatization.
Here is the link:
http://www.basistech.com/knowledge-center/search/2010-09-language-identification-language-support-and-entity-extraction.pdf
Can anyone help
Justine,
The JSON update request handler was added in Solr 3.1. Please download this
version and try again.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
On 3. mai 2011, at 22.34, Justine Mathews wrote:
Hi,
When I add the JSON request handler as below for
If I define different fields with different boosts and then
copy them into
another field and search using this universal
field, will the boosts
be applied?
No. copyField just copies raw content.
Hi,
Solr IS an enterprise search server. And there is only one edition :)
I'd wait a few more weeks until the Solr 3.1 books are available, and then read
up on it.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
On 5. mai 2011, at 09.37, bryan rasmussen wrote:
I am
Hi,
Solr does not have lemmatization out of the box.
You'll have to find third-party analyzers; the best known is from
BasisTech. Please contact them to learn more.
I'm not aware of any open source lemmatizers for Solr.
--
Jan Høydahl, search solution architect
Cominvent AS -
ok, I just saw the thing about syncing the version numbers.
Is there any information on these Solr 3.1 books? Publishers,
publication dates, website on them?
Mvh,
Bryan Rasmussen
On Thu, May 5, 2011 at 10:57 AM, Jan Høydahl jan@cominvent.com wrote:
Hi,
Solr IS an enterprise search
Hello,
It's final in the trunk, and has always been since conception in 2006 at
revision 372455. Why?
--
Regards,
K. Gabriele
--- unchanged since 20/9/10 ---
P.S. If the subject contains [LON] or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
Hi,
I have to index records whose fields contain dates.
A date can be: 2011, 2011-05, or 2015-05-01. Trailing characters can also
be slashes.
I'd like to convert these values into a valid date for Solr.
So my question is: what is the best way to achieve this?
1) Use solr.DateField and
Hi
I can load all indexed data using a /select request with *:* as the query param.
I tried the same with a /Search request, but it didn't work. It didn't work
with * as the query value either. I am using the disMax handler. Is it possible to load
all indexed data in search and suggest requests?
--
View this message
On Thu, May 5, 2011 at 3:48 PM, Kannan ramkannan2...@gmail.com wrote:
Hi
I can load all indexed data using a /select request with *:* as the query param.
I tried the same with a /Search request, but it didn't work. It didn't work
with * as the query value either. I am using the disMax handler. Is it possible to
I am using disMax handler. Is it
possible to load
all indexed data in search and suggest request?
With dismax, you can use q.alt=*:* parameter. Don't use q parameter at all.
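As a rough sketch of the suggestion above, a match-all request for a dismax handler could be assembled as follows. The host, port, and /select path are borrowed from a URL elsewhere in this listing and are only assumptions for any given install; build_match_all_url is a hypothetical helper name, and selecting dismax through defType is likewise an assumption (a handler already configured for dismax would not need it).

```python
from urllib.parse import urlencode


def build_match_all_url(base_url, rows=10):
    """Build a dismax request that matches every document.

    The q parameter is deliberately left out; q.alt carries the
    match-all query in standard Lucene syntax.
    """
    params = [
        ("defType", "dismax"),  # assumption: dismax selected per request
        ("q.alt", "*:*"),
        ("rows", str(rows)),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical local install; adjust host, port, and handler path as needed.
url = build_match_all_url("http://localhost:8983/solr/select")
print(url)
```

The colon and asterisks are percent-encoded by urlencode, which is harmless since the server decodes them back to *:*.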
--- On Thu, 5/5/11, Marc SCHNEIDER marc.schneide...@gmail.com wrote:
From: Marc SCHNEIDER marc.schneide...@gmail.com
Subject: Format date before indexing it
To: solr-user solr-user@lucene.apache.org
Date: Thursday, May 5, 2011, 12:51 PM
Hi,
I have to index records that have fields
Rajani
You might also want to look at Balie ( http://balie.sourceforge.net/ ), from
the web site:
Features:
• language identification
• tokenization
• sentence boundary detection
• named-entity recognition
Can't vouch for it though.
On May 5, 2011, at 4:58
Hi all,
We’re really proud to release the first official major release of Lily
- our flagship repository for scalable data and content management,
after 18 months of intense engineering work. We’re thrilled being
first to launch the first open source, general-purpose,
highly-scalable yet flexible
Dear Solr Experts,
First of all, I would like to thank you for your patience when answering
questions of those who are less experienced.
And now to the main topic: I would like to learn whether it is possible
to restructure a Solr cloud programmatically.
Let me describe the system we are
Unfortunately, the current out-of-the-box defaults (example config)
for Solr are a disaster for non-whitespace languages (CJK, Thai,
etc.), ie, exactly what you've hit.
This is because Lucene's QueryParser can unexpectedly, dangerously,
create PhraseQuery even when the user did not ask for it
I've tried to re-install solr on tomcat, and now when I launch tomcat in
debug mode I see the following exception relating to solr. It's not enough
to understand the problem (and fix it), but I don't know where to look for
more (or what to do). Please help me.
Following the tutorial and
Hi,
One approach if you're using Amazon is using BeanStalk
* Create one master with 12 cores, named jan, feb, mar etc
* Every month, you clear the current month index and switch indexing to it
You will only have one master, because you're only indexing to one month at a
time
* For each of the
There are two ways to characterize what I'd like to do.
1) use the EmbeddedSolrServer to launch Solr, and subsequently enable
the HTTP GET/json servlet. I can provide the 'servlet' wiring, I just
need to be able to hand an HttpServletRequest to something and
retrieve in return the same json that
Hi.
I need an autocomplete solution to handle case-insensitive queries but
return the original text with the case still intact. I've experimented
with both the Suggester and TermComponent methods. TermComponent is working
when I use the regex option; however, it is far too slow. I get the
Hi Gabriele,
The sequence should be
1. svn update
2. ant get-maven-poms
3. mvn -N -Pbootstrap install
I think you left out #2 - there was a very recent change to the POMs that
affects the noggit jar name.
Steve
-Original Message-
From: Gabriele Kahlout
Hi All,
I have solr and tika installed and am happily extracting and indexing
various files.
Unfortunately on some word documents it blows up since it tries to
auto-generate a 'title' field but my title field in the schema is single
valued.
Here is my config for the extract handler...
Hi Emyr,
You could try using the extractOnly=true parameter [1]. Of course,
you'll need to repost the extracted text manually.
--jay
[1] http://wiki.apache.org/solr/ExtractingRequestHandler#Extract_Only
On Thu, May 5, 2011 at 9:36 AM, Emyr James emyr.ja...@sussex.ac.uk wrote:
Hi All,
I
Okay, that sequence worked, but then shouldn't I be able to do $ mvn install
afterwards? This is what I get:
...
Compiling 478 source files to /Users/simpatico/debug/solr4/solr/build/solr
-
COMPILATION ERROR :
Thanks for the suggestion, but surely there must be a better way than
that to do it?
I don't want to post the whole file up, get it extracted on the server,
send the extracted text back to the client then send it all back up to
the server again as plain text.
On 05/05/11 14:55, Jay Luker
2011/5/5 Michael McCandless luc...@mikemccandless.com:
The very first thing every non-whitespace language Solr app should do
is turn off autoGeneratePhraseQueries!
Luckily, this is configurable per FieldType... so if it doesn't exist
yet, we should come up with a good
CJK fieldtype to add to
Hi Emyr,
You can try the XPath based approach and see if that works. Also, see if
dynamic fields can help you for the meta data fields.
References-
http://wiki.apache.org/solr/SchemaXml#Dynamic_fields
http://wiki.apache.org/solr/ExtractingRequestHandler#Input_Parameters
Thanks Gora!
[ ]'s
Leonardo da S. Souza
°v° Linux user #375225
/(_)\ http://counter.li.org/
^ ^
On Thu, May 5, 2011 at 3:09 AM, Gora Mohanty g...@mimirtech.com wrote:
On Thu, May 5, 2011 at 5:08 AM, Leonardo Souza leonardo...@gmail.com
wrote:
Hi guys,
Can i have a field name
There is still a functionality gap in Solr's spellchecker even with Solr-2010
applied. If a user enters a word that is in the dictionary, solr will never
try to correct it. The only way around this is to use
spellcheck.onlyMorePopular. The problem with this approach is
onlyMorePopular
Thanks for the suggestion, Peter;
the problem was elsewhere though - somewhere in the highlighting
module.
I've fixed it by adding (into the field definition in schema.xml) a
custom Czech charFilter (with mappings such as í => i) - then it started to
work as expected.
Cheers,
Pavel
Hi,
I'm not really sure how these can help with my problem. Can you give a
bit more info on this?
I think what I'm after is a fairly common request:
http://lucene.472066.n3.nabble.com/Controlling-Tika-s-metadata-td2378677.html
Hey Emyr,
Looking at your stack trace below my guess is that you have two conflicting
Apache POI jars in your classpath. The odd stack trace is indicative of that as
the class loader is likely loading some other version of the DirectoryNode
class that doesn't have the iterator method.
Tommaso,
Thanks. Now Solr finds the descriptor; however, I think this is very bad
practice.
Descriptors really aren't meant to be jarred up. They often contain
relative paths.
For example, in my case I have a directory that looks like:
appassemble
|- desc
|- pear
where
While the question remains valid, I found the reason for my problem.
While backing up, I had saved Tomcat's descriptor file in my $SOLR_HOME, and Solr
was trying to read it as described in the SolrCore
Wiki (http://wiki.apache.org/solr/CoreAdmin).
What saved me was remembering Chris's earlier
Hi,
(Sorry, emailing again because the last post was not posted...)
I have been using the SolrSpellCheckComponent. One of my requirements is
that if a user types something like add, Solr would return adidas. To
get something like this, I used EdgeNGramFilterFactory and applied it to
the
Hi,
Try this solution using a Solr core:
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
On 5. mai 2011, at 15.22, Kusenda, Brandyn J wrote:
Hi.
I need an
Hi Gabriele,
On 5/5/2011 at 9:57 AM, Gabriele Kahlout wrote:
Okay, that sequence worked, but then shouldn't I be able to do $ mvn
install afterwards? This is what I get:
...
COMPILATION ERROR :
-
Hi,
I bench-marked the slow stats queries (6 point estimate) using the same
hardware on an index of size 104M. We use a Solr/Lucene 3.1-mod which
returns only the sum and count for statistics component results. Solr/Lucene
is run on jetty.
The relationship between query time and set of found
Thanks Steve, this will be really simpler next time :)
Is it documented somewhere? If not, perhaps we could add something to this
page, for example:
http://wiki.apache.org/solr/FrontPage#Solr_Development
or here:
http://wiki.apache.org/solr/NightlyBuilds
Ludovic.
2011/5/5 steve_rowe [via
Steven, thank you!
$ mvn -DskipTests=true install
works!
[INFO] Reactor Summary:
[INFO]
[INFO] Grandparent POM for Apache Lucene Java and Apache Solr SUCCESS
[13.142s]
[INFO] Lucene parent POM . SUCCESS [0.345s]
[INFO] Lucene Core
Hello,
I'm using Solr version 1.4.0 with Tomcat 6. I have 2 Solr instances running as
2 different web apps with separate data folders. My application requires
frequent commits from multiple clients. I've noticed that when more than one
client tries to commit at the same time, these
I’m using the Data Import Handler to index emails.
The problem is that I want to add my own field, such as security_number.
Does anyone have any idea?
Regards,
--
James Bond Fang
The best way to add your own fields is to create a custom Transformer sub-class.
See:
http://www.lucidimagination.com/search/out?u=http%3A%2F%2Fwiki.apache.org%2Fsolr%2FDataImportHandler
This will guide you through the steps.
Peter
2011/5/5 方振鹏 michong900...@xmu.edu.cn:
I’m using Data
: $ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
: <queryResponseWriter name="xml" class="org.apache.solr.request.XMLResponseWriter" default="true"/>
:
: Now I comment out the line in solrconfig.xml, and there's no more writer.
: $ xmlstarlet sel -t -c
{quote}
...
Caused by: java.io.EOFException: Can not read response from server.
Expected to read 4 bytes, read 0 bytes before connection was
unexpectedly lost.
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2539)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2989)
You're welcome, I'm glad you got it to work. - Steve
-Original Message-
From: Gabriele Kahlout [mailto:gabri...@mysimpatico.com]
Sent: Thursday, May 05, 2011 2:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Is it possible to build Solr as a maven project?
Steven, thank you!
Hi Sid,
unfortunately not, and as far as I know it is not possible to meet your
requirements with Solr's SpellCheck packages (I am talking about version 1.4,
since there are some changes in 3.1).
Regards,
Em
--
View this message in context:
Hi,
I am new to Solr and this is my first attempt at indexing data. I am
getting the following exception while indexing:
org.apache.solr.common.SolrException: Invalid Date String:'2011-01-07' at
org.apache.solr.schema.DateField.parseMath(DateField.java:165) at
I've now tried to write my own QueryResponseWriter plugin[1], as a maven
project depending on Solr Core 3.1, which is the same version of Solr I've
installed. It seems I'm not able to get rid of some cache.
$ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
Just for the reference.
$ svn update
At revision 1099940.
On Thu, May 5, 2011 at 9:14 PM, Steven A Rowe sar...@syr.edu wrote:
You're welcome, I'm glad you got it to work. - Steve
-Original Message-
From: Gabriele Kahlout [mailto:gabri...@mysimpatico.com]
Sent: Thursday, May 05,
Hi,
Sorry for the possible double post, I wrote this up but had the
incorrect sender address, so I am guessing that my previous one is going
to be rejected by the list moderation daemon.
I am trying to figure out options for the following problem. I am on
Solr 1.4.1 (Lucene 2.9.1).
I have
--- On Thu, 5/5/11, Sujit Pal sujit@comcast.net wrote:
From: Sujit Pal sujit@comcast.net
Subject: Custom sorting based on external (database) data
To: solr-user solr-user@lucene.apache.org
Date: Thursday, May 5, 2011, 11:03 PM
Hi,
Sorry for the possible double post, I wrote this
Thank you Ahmet, looks like we could use this. Basically we would do
periodic dumps of the (unique_id|computed_score) sorted by score and
write it out to this file followed by a commit.
Found some more info here, for the benefit of others looking for
something similar:
Hi guys,
another question on custom search components:
Is there any way to force the response to be 0 results from within a search
component (and break out of the component chain)?
I'm doing some checks in my first-component and in some cases would like to
stop processing the request and just
Hi,
I haven't used Suggester yet, but couldn't you feed it all lowercase content
and
then lowercase whatever the user is typing before sending it to Suggester to
avoid case mismatch?
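The idea above (match on lowercased text, but hand back the original casing) can be illustrated outside Solr with a small in-memory sketch. This is not the Suggester API; CaseInsensitiveSuggester and its sample phrases are hypothetical, and the sketch only shows how a lowercased lookup key can be kept alongside the untouched display text.

```python
from bisect import bisect_left


class CaseInsensitiveSuggester:
    """Prefix lookup on lowercased keys that returns the original text."""

    def __init__(self, phrases):
        # Sort (lowercased key, original phrase) pairs so that all keys
        # sharing a prefix sit next to each other.
        self._entries = sorted((p.lower(), p) for p in phrases)
        self._keys = [key for key, _ in self._entries]

    def suggest(self, typed, limit=5):
        # Lowercase the user's input before comparing, as suggested above.
        prefix = typed.lower()
        start = bisect_left(self._keys, prefix)
        results = []
        for key, original in self._entries[start:]:
            if not key.startswith(prefix) or len(results) >= limit:
                break
            results.append(original)
        return results


suggester = CaseInsensitiveSuggester(["Adidas", "adapter", "Nike", "ADSL modem"])
print(suggester.suggest("AD"))
```

In a Solr setup the same split would presumably be a lowercased field for matching plus a stored field holding the original text.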
Autocomplete on http://search-lucene.com/ uses
http://sematext.com/products/autocomplete/index.html if you
Is there any way to force the response to be 0 results
from within a search component (and break out of the
component chain)?
I'm doing some checks in my first-component and in some
cases would like to stop processing the request and just
pretend, that there are 0 results ...
Yes. You can
Nice, it works like a charm.
I am using Solr 1.4.1. Here is my configuration for the Chinese field:
<fieldType name="text_ch" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.ChineseTokenizerFactory"/>
</analyzer>
<analyzer type="query">
Hi,
I'd like to solicit your thoughts about Search Analytics if you are doing any
sort of analysis/reporting of search logs or click stream or anything related.
* Which information or reports do you find the most useful and why?
* Which reports would you like to have, but don't have for
What's the probability that I can build a non-trivial Solr app without writing
any Java?
I've been planning to use Solr, Lucene, and existing plug-ins, and sort of
hoping not to write any Java (the app itself is Ruby / Rails). The docs (such as
http://wiki.apache.org/solr/FAQ) seem encouraging.
org.apache.solr.common.SolrException: Invalid Date
String:'2011-01-07' at
org.apache.solr.schema.DateField.parseMath(DateField.java:165)
Solr accepts date in the following format: 2011-01-07T00:00:00Z
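A minimal sketch of producing that format on the client before indexing, assuming the input is a year, year-month, or year-month-day string separated by dashes or slashes. Defaulting a missing month or day to 01 and the time to midnight UTC is an assumption made here for illustration, and to_solr_date is a hypothetical helper name.

```python
import re
from datetime import date


def to_solr_date(raw):
    """Expand YYYY, YYYY-MM or YYYY-MM-DD (dashes or slashes) into the
    full timestamp form shown above; missing parts default to 01."""
    parts = [p for p in re.split(r"[-/]", raw.strip()) if p]
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError("unrecognized date: %r" % raw)
    year = int(parts[0])
    month = int(parts[1]) if len(parts) > 1 else 1
    day = int(parts[2]) if len(parts) > 2 else 1
    # datetime.date validates the calendar values (e.g. rejects month 13).
    checked = date(year, month, day)
    return checked.isoformat() + "T00:00:00Z"


for value in ["2011-01-07", "2011", "2011/05/"]:
    print(value, "->", to_solr_date(value))
```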
I understand from reading some articles that Solr stores
time only in UTC,
this is the
Short answer: Yes, you can deploy a Solr cluster and write an application that
talks to it without writing any Java (though it may be in PHP, Python, or
similar, unless that application is you typing telnet my-solr-server 8983)
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
When I ran the search engine at Feedster, I wrote a perl script that ran
nightly and gave me:
total number of searches
total number of searches per hour
N most frequent searches
max time for a search
min time for a search
mean time for searches
median time for searches
N slowest searches
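The report listed above could be approximated along these lines. The original was a Perl script that is not shown here, so this Python version is only a sketch; the (hour, query, elapsed_ms) record layout and the summarize name are hypothetical.

```python
from collections import Counter
from statistics import mean, median


def summarize(log, top_n=3):
    """Summarize a search log given as (hour, query, elapsed_ms) tuples."""
    times = [ms for _, _, ms in log]
    return {
        "total": len(log),
        "per_hour": dict(Counter(hour for hour, _, _ in log)),
        "most_frequent": Counter(q for _, q, _ in log).most_common(top_n),
        "max_ms": max(times),
        "min_ms": min(times),
        "mean_ms": mean(times),
        "median_ms": median(times),
        # Slowest searches first, truncated to the top N.
        "slowest": sorted(log, key=lambda rec: rec[2], reverse=True)[:top_n],
    }


# Made-up sample records, purely for illustration.
sample = [
    (9, "solr", 120),
    (9, "lucene", 80),
    (10, "solr", 300),
    (10, "nutch", 40),
]
report = summarize(sample, top_n=2)
for name, value in report.items():
    print(name, value)
```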
Rohit,
The solr server using TrieDateField must receive values in the format
2011-01-07T17:00:30Z
This should be a UTC-based datetime. The offset can be applied once you get
your results back from solr
SimpleDateFormat df = new SimpleDateFormat(format);
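The same round trip can be sketched in Python: convert a local time to the UTC string before indexing, and apply the offset again after results come back. The +05:30 offset is only an example value, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_solr_utc(local_dt):
    """Format a timezone-aware datetime as a UTC string ending in Z."""
    if local_dt.tzinfo is None:
        raise ValueError("need a timezone-aware datetime")
    return local_dt.astimezone(timezone.utc).strftime(SOLR_FORMAT)


def from_solr_utc(value, offset_hours):
    """Parse a UTC string from Solr and shift it to a fixed offset."""
    utc_dt = datetime.strptime(value, SOLR_FORMAT).replace(tzinfo=timezone.utc)
    return utc_dt.astimezone(timezone(timedelta(hours=offset_hours)))


# Example offset (an assumption): a client running at UTC+05:30.
local_zone = timezone(timedelta(hours=5, minutes=30))
stamp = to_solr_utc(datetime(2011, 1, 7, 22, 30, 30, tzinfo=local_zone))
print(stamp)
print(from_solr_utc(stamp, 5.5).isoformat())
```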
Hmm, this is puzzling. If you could come up with a couple of xml
files and a schema
that illustrate this, I'll see what I can see...
Thanks,
Erick
On Wed, May 4, 2011 at 7:05 PM, Viswa S svis...@hotmail.com wrote:
Erik,
I suspected the same, and set up a test instance to reproduce this. The
For a truly universal field, I'm not at all sure how you'd proceed. But if you
know what your sub-fields are in advance, have you considered just making
them regular fields and then throwing (d)dismax at it?
Best
Erick
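To make the suggestion above concrete: with dismax, per-field boosts go into the qf parameter at query time instead of being carried through copyField. The sketch below only builds the query string; the field names come from earlier in this thread, while the boost values and the dismax_params helper are made up for illustration.

```python
from urllib.parse import urlencode


def dismax_params(user_query, field_boosts):
    """Return a query string that searches several fields with boosts."""
    # qf takes space-separated entries of the form field^boost.
    qf = " ".join("%s^%s" % (field, boost) for field, boost in field_boosts)
    return urlencode([("defType", "dismax"), ("q", user_query), ("qf", qf)])


# Boost values here are arbitrary examples.
query_string = dismax_params(
    "john smith", [("firstname", 2), ("surname", 1.5), ("location", 1)]
)
print(query_string)
```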
On Wed, May 4, 2011 at 11:51 PM, deniz denizdurmu...@gmail.com wrote:
I remember the same, except I think I've seen the recommendation that you
make all the letters lower-case. As I remember, there are some interesting
edge cases that you might run into later with upper case.
But I can't remember the specifics either
Erick
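A small check along the lines of what is recalled above might look like this. It encodes a remembered guideline (letters, digits, underscore, preferably lower case), not a documented Solr rule, and check_field_name is a hypothetical helper.

```python
import re

# Pattern for the remembered guideline: letters, digits and underscore only.
SAFE_FIELD_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_field_name(name):
    """Return a list of warnings; an empty list means the name follows
    the conservative guideline recalled in the thread."""
    warnings = []
    if not SAFE_FIELD_NAME.fullmatch(name):
        warnings.append("contains characters outside A-Z, a-z, 0-9, _")
    if name != name.lower():
        warnings.append("contains upper-case letters")
    return warnings


for candidate in ["file.size", "file_size", "fileSize"]:
    print(candidate, check_field_name(candidate))
```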
On Thu, May 5, 2011 at 10:08 AM,
Hi,
After upgrading from Solr 1.4.0 to 3.1, our highlighting has gone from
highlighting short pieces of text to displaying what appears to be the entire
contents of the highlighted field.
The request using solrj is setting the following:
params.setHighlight(true);
Please find attached the schema and some test data (test.xml).
Thanks for looking into this.
Viswa
Date: Thu, 5 May 2011 19:08:31 -0400
Subject: Re: Solr Terms and Date field issues
From: erickerick...@gmail.com
To: solr-user@lucene.apache.org
Hmm, this is puzzling. If you could come up
It is normal to see weird things in admin/schema.jsp or the terms component with
trie-based types. Please see http://search-lucene.com/m/WEfSI1Yi4562/
If you really need the terms component, consider using copyField (tdate to string
type).
Please find attached the schema and some test data
I am running into this problem as well, but only sporadically, and only
in my 3.1 test environment, not 1.4.1 production. I may have narrowed
things down; I am now interested in learning whether this is a problem
with the MySQL connector or DIH.
On 4/21/2011 6:09 PM, Scott Bigelow wrote:
Alex, thanks for your response. I suspect you're right about
autoCommit; I ended up solving the problem by merely moving the entire
Solr install, untouched, to a significantly larger instance (EC2
m1.small to m1.large). I think it is appropriately sized now for the
quantity and intensity of
Yeah you don't need Java to use Solr. PHP, Curl, Python, HTTP Request
APIs all work fine.
The purpose of Solr is to wrap Lucene into a REST-like API that anyone
can call using HTTP.
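For instance, a client with no Java at all could look roughly like the sketch below, which uses only the Python standard library. The base URL is a placeholder for a real install, solr_select and extract_docs are hypothetical names, and the sample payload is hand-written in the usual JSON response shape rather than captured from a server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def solr_select(base_url, query, rows=10):
    """Run a query over plain HTTP and return the decoded JSON response."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    with urlopen(base_url + "?" + params) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_docs(payload):
    """Pull the document list out of a decoded select response."""
    return payload.get("response", {}).get("docs", [])


# Offline illustration with a hand-written payload; a live call would be
# solr_select("http://localhost:8983/solr/select", "*:*") on a local install.
sample = json.loads('{"response": {"numFound": 1, "docs": [{"id": "doc1"}]}}')
print(extract_docs(sample))
```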
On Thu, May 5, 2011 at 4:35 PM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:
Short answer: Yes, you can
Are you giving that solution away? What are the costs, etc.?
On Thu, May 5, 2011 at 2:58 PM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:
Hi,
I haven't used Suggester yet, but couldn't you feed it all lowercase content
and
then lowercase whatever the user is typing before sending it
Is there a parser that can take a string and tell you what part is an
address, and what is not?
Split the field into 2 fields?
Search: Dr. Bell in Denver, CO
Search: Dr. Smith near 10722 Main St, Denver, CO
Search: Denver, CO for Cardiologist
Thoughts?
2011/5/5 François Schiettecatte
I am using DIH with the MySQL connector to import data into my index.
When doing a full import in my 3.1 test environment, it sometimes loses
connection with the database and ends up rolling back the import. My
import configuration uses a single query, so there's no possibility of a
Hi Craig,
Thanks for the response. What we actually need to achieve is to see group-by
results based on dates, like:
2011-01-01 23
2011-01-02 14
2011-01-03 40
2011-01-04 10
Now, the records in my table run into the millions, and grouping the results based on
the UTC date would not produce the right result
83 matches