oh okay, thx a lot ;)
can I escape all possible operators with a request handler?
or can I escape these operators automatically when the syntax is wrong?
I use Solr with a PHP client ^^
MitchK wrote:
According to Ahmet Arslan's Post:
Solr is expecting a word after the !, because it is an
i have the same problem ...
i wrote an email...--
Jonas, did you set the country correctly? If you set it to the US it will
validate against US number formats and not recognize your number in Germany.
but i did not find any option to set my country =(
Janne Majaranta wrote:
Do I need a
Hello,
If I write a custom analyzer that accepts a specific attribute in the
constructor
public MyCustomAnalyzer(String myAttribute);
Is there a way to dynamically send a value for this attribute from Solr at
index time in the XML Message ?
<add>
<doc>
<field name="content"
Shalin Shekhar Mangar wrote:
On Sat, Feb 27, 2010 at 5:22 PM, Suram reactive...@yahoo.com wrote:
Hi all,
How can I configure Core admin under the Tomcat server? Kindly
could anyone tell me
There's nothing to configure. If you are using multiple cores in Solr 1.4
then
Hello,
I came to know that coord() value is being calculated on each
sub-query (BooleanQuery) present in the main query.
For example: f = field, k = keyword
(( f1:k1 OR f2:k2) OR f3:k3) OR f4:k4
Here, if I am correct, coord() is being calculated a total of 3 times. My
goal is to boost (
can I escape all possible operators with a request handler?
With a custom one yes. You can use the static method
org.apache.lucene.queryParser.QueryParser.escape(String s).
public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
throws Exception, ParseException
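Inside such a handler you would run the raw query string through the same escaping QueryParser applies. Here is a self-contained sketch: the escaping is re-implemented so it compiles without the Lucene jar on the classpath, and the character set mirrors what QueryParser.escape covers in Lucene 2.9 — in a real handler you would simply call QueryParser.escape directly.

```java
// Minimal re-implementation sketch of what
// org.apache.lucene.queryParser.QueryParser.escape(String) does:
// backslash-escape every character that is part of the Lucene query syntax.
public class EscapeSketch {
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' || c == '+' || c == '-' || c == '!' || c == '('
                    || c == ')' || c == ':' || c == '^' || c == '[' || c == ']'
                    || c == '"' || c == '{' || c == '}' || c == '~' || c == '*'
                    || c == '?' || c == '|' || c == '&') {
                sb.append('\\'); // prefix the operator with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("foo!bar (1+1):2")); // foo\!bar \(1\+1\)\:2
    }
}
```

In the handler, you would apply this to the incoming q parameter before handing it to the query parser.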
Thanks Mitch, using the analysis page has been a real eye-opener and given
me a better insight into how Solr was applying the filters (and more
importantly in which order). I've ironically ended up with a charFilter
mapping file as this seemed the only route to replacing characters before
the
Hello,
On the solr wiki, here:
http://wiki.apache.org/solr/SolrPerformanceFactors
It is written:
mergeFactor Tradeoffs
High value merge factor (e.g., 25):
Pro: Generally improves indexing speed
Con: Less frequent merges, resulting in a collection with more index
files which may slow
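For reference, the factor being discussed is set in solrconfig.xml; a high-value setting along the lines the wiki describes would look like this (the value 25 is just the wiki's example):

```xml
<!-- solrconfig.xml (Solr 1.4) sketch: higher mergeFactor = faster indexing,
     more segment files on disk, potentially slower searches until an optimize -->
<indexDefaults>
  <mergeFactor>25</mergeFactor>
</indexDefaults>
```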
Wonderful! That explains it. Thanks a lot!
Regards,
On Mon, Mar 8, 2010 at 6:39 AM, Jay Hill jayallenh...@gmail.com wrote:
Yes, if omitNorms=true, then no lengthNorm calculation will be done, and
the
fieldNorm value will be 1.0, and lengths of the field in question will not
be a factor in
Hi Mark,
On Sun, Mar 7, 2010 at 6:20 PM, Mark Fletcher
mark.fletcher2...@gmail.comwrote:
I have created 2 identical cores coreX and coreY (both have different
dataDir values, but their index is same).
coreX - always serves the request when a user performs a search.
coreY - the updates will
Hi,
Thank You for explaining it in a simple way.
The article really helped me to understand the concepts better.
My question is: is it necessary that the data you are indexing in the
spatial example be in the OSM format, using facts files?
In my case, I am trying to index data that has
On Mon, Mar 8, 2010 at 6:21 PM, KshamaPai kshamapai2...@gmail.com wrote:
Hi,
Thank You for explaining it in a simple way.
The article really helped me to understand the concepts better.
My question is: is it necessary that the data you are indexing in the
spatial example be in the
On Mon, Mar 8, 2010 at 5:31 PM, Marc Des Garets marc.desgar...@192.comwrote:
If I have a mergeFactor of 50 when I build the index and then I optimize
the index, I end up with 1 index file so I have a small number of index
files and having used mergeFactor of 50 won't slow searching? Or my
hi,
I am interested in spatial search. I am using Apache-solr 1.4.0 and
LocalSolr
I have followed the instructions given in the following website
http://gissearch.com/localsolr
The query of the following format
/solr/select?qt=geo&lat=xx.xx&long=yy.yy&q=abc&radius=zz
(after substituting valid values)
Does anyone know if it's possible to get the position of the highlighted
snippet within the field that's being highlighted?
It would be really useful for me to know if the snippet is at the beginning or
at the end of the text field that it comes from.
Thanks, Mark.
Have you looked in your SOLR log file to see what that says?
Check the editor you use for your XML. Is it using UTF-8? (Although you
don't appear to be using any odd characters, probably not a problem.)
Think about taking the xml file that *does* work, copying it and editing
*that* one.
Erick
I'm uploading .htm files to be extracted - some of these files are include
files that have snippets of HTML rather than fully formed html documents.
solr-cell stores the raw HTML for these items, rather than extracting the text.
Is there any way I can get solr to encode this content prior to
Well, it's not unfortunate <g>. What would it mean to sort
on a tokenized field? Let's say I index "is testing fun". Removing
stopwords and stemming probably indexes "test fun". How
in the world would meaningful sorts happen now? Even if
it was in order, since the first token was stopped out this
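The usual way around this is to search on the tokenized field but sort on an unanalyzed string copy of it. A schema.xml sketch, with illustrative field names (the poster's actual schema is not shown):

```xml
<!-- search on "title", sort on the unanalyzed copy -->
<field name="title" type="text" indexed="true" stored="true"/>
<field name="title_sort" type="string" indexed="true" stored="false"/>
<copyField source="title" dest="title_sort"/>
```

Queries would then use sort=title_sort asc while matching against title.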
Perfect. Thank you for your help.
-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: 08 March 2010 12:57
To: solr-user@lucene.apache.org
Subject: Re: question about mergeFactor
On Mon, Mar 8, 2010 at 5:31 PM, Marc Des Garets
marc.desgar...@192.comwrote:
Hi Shalin,
Thank you for the reply.
I got your point. So I understand merge will just duplicate things.
I ran the SWAP command. Now:-
COREX has the dataDir pointing to the updated dataDir of COREY. So COREX has
the latest.
Again, COREY (on which the update regularly runs) is pointing to the old
Hello.
I use 2 cores for Solr.
When I restart my Tomcat on Debian, Tomcat deletes my index.
I set data.dir to
<dataDir>${solr.data.dir:./suggest/data}</dataDir>
and
<dataDir>${solr.data.dir:./search/data}</dataDir>
<cores adminPath="/admin/cores">
<core name="search" instanceDir="search"
Hi,
I have started using Solr. I had a problem when I inserted a database with
2 million rows. I hav
The server encounters the error: java.lang.OutOfMemoryError: Java heap space
I searched around but can't find the solution.
Any help regarding this will be appreciated.
Thanks in advance
You're probably hitting the difference between *nix file
handling and Windows. When you delete a file on a
Unix variant, if some other program has the file open
the file doesn't go away until that other program closes
it.
HTH
Erick
On Mon, Mar 8, 2010 at 9:08 AM, stocki st...@shopgate.com wrote:
I had same issue with Jetty
Adding extra memory resolved my issue, i.e.: java -Xms512m -Xmx1024m -jar
start.jar
It's in the manual, but I can't seem to find the link.
On 8 Mar 2010, at 14:09, Quan Nguyen Anh wrote:
Hi,
I have started using Solr. I had a problem when I insert a database with 2
Am 08.03.2010 15:08, schrieb stocki:
Hello.
I use 2 cores for Solr.
When I restart my Tomcat on Debian, Tomcat deletes my index.
You should check your Tomcat setup.
I set data.dir to
<dataDir>${solr.data.dir:./suggest/data}</dataDir>
and
<dataDir>${solr.data.dir:./search/data}</dataDir>
Hi Mark,
On Mon, Mar 8, 2010 at 7:38 PM, Mark Fletcher
mark.fletcher2...@gmail.comwrote:
I ran the SWAP command. Now:-
COREX has the dataDir pointing to the updated dataDir of COREY. So COREX
has the latest.
Again, COREY (on which the update regularly runs) is pointing to the old
index of
Hi,
is anybody willing to share experience about how to extract content from
mailing list archives in order to have it indexed by Lucene or Solr?
Imagine that we have access to the archive of some mailing list (e.g.
http://www.mail-archive.com/mailman-users%40python.org/) and we would like
to index
I just checked popular search services and it seems that neither
lucidimagination search nor search-lucene support this:
http://www.lucidimagination.com/search/document/954e8589ebbc4b16/terminating_slashes_in_url_normalization
All,
So I think I have my first issue figured out, need to add terms to the
default search. That's fine.
New issue is that I'm trying to load child entities in with my entity.
I added the appropriate fields to solrconfig.xml
<field name="sections" type="string" indexed="true" stored="true"
you just need to delete your browser cache ;)
stocki wrote:
i have the same problem ...
i wrote an email...--
Jonas, did you set the country correctly? If you set it to the US it will
validate against US number formats and not recognize your number in
Germany.
but i did not find any
Where would I see this? I do believe the fields are not ending up in the
index.
Thanks
John
On Mon, Mar 8, 2010 at 10:34 AM, Erick Erickson erickerick...@gmail.comwrote:
What does the solr admin page show you is actually in your index?
Luke will also help.
Erick
On Mon, Mar 8, 2010 at
Hi Shalin,
Thank you for the mail.
My main purpose of having 2 identical cores
COREX - always serves user request
COREY - once every day, takes the updates/latest data and passes it on to
COREX.
is:-
Suppose I have only one COREY, and a request comes to COREY while
the update of the
okay, I installed my Solr like the wiki said, and made a new try. Here is one of
my two files:
<Context docBase="/var/lib/tomcat5.5/solr.war" debug="0" crossContext="true">
<Environment name="solr/home" type="java.lang.String"
value="/home/sites/my/path/to/Solr/home/cores/suggest" override="true"
I'm encountering a potential bug in Solr regarding wildcards. I have two
fields defined thusly:
<!-- A general unstemmed text field - good if one does not know the
language of the field -->
<fieldType name="textgen" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
What database are you using? Many of the JDBC drivers try to pull the
entire resultset into RAM before feeding it to the application that
requested the data. If it's MySQL, I can show you how to fix it. The
batchSize parameter below tells it to stream the data rather than buffer
it. With
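The MySQL fix Shawn alludes to is the batchSize="-1" setting on the DataImportHandler dataSource, which makes Connector/J stream rows one at a time instead of buffering the whole result set in RAM. A data-config.xml sketch (the URL and credentials are placeholders):

```xml
<!-- batchSize="-1" enables row-by-row streaming with the MySQL JDBC driver -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost/mydb"
            user="user" password="pass"
            batchSize="-1"/>
```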
Hi Mark,
On Mon, Mar 8, 2010 at 9:23 PM, Mark Fletcher
mark.fletcher2...@gmail.comwrote:
My main purpose of having 2 identical cores
COREX - always serves user request
COREY - once every day, takes the updates/latest data and passes it on to
COREX.
is:-
Suppose say I have only one COREY
On 03/08/2010 10:53 AM, Mark Fletcher wrote:
Hi Shalin,
Thank you for the mail.
My main purpose of having 2 identical cores
COREX - always serves user request
COREY - once every day, takes the updates/latest data and passes it on to
COREX.
is:-
Suppose say I have only one COREY and suppose a
Try http://<localhost address and port>/solr/admin. You'll see a bunch
of links that'll allow you to examine many aspects of your installation.
Additionally, get a copy of Luke (Google Lucene Luke) and point it at
your index for a detailed look at the index.
Finally, the SOLR log file might give
Erick,
I'm sorry, but it's not helping much. I don't see anything on the admin
screen that allows me to browse my index. Even using Luke, my assumption is
that it's not loading correctly in the index. What parameters can I change
in the logs to make it print out more information? I want to see
Sorry, won't be able to really look till tonight. Did you try Luke? What did
it
show?
One thing I did notice though...
<field name="sections" type="string" indexed="true" stored="true"
multiValued="true"/>
string types are not analyzed, so the entire input is indexed as
a single token. You might want text
query:
spell?q=name:(cm*) OR namesimple:(cm*)
returns:
CMJ foo bar
CME foo bar
spell?q=name:(CM*) OR namesimple:(CM*)
returns
No results.
Wildcard queries are not analyzed by Lucene and hence the behavior. [1]
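Because the wildcard term bypasses the analyzer chain, a common workaround (assuming, as here, that the field's index analyzer lowercases tokens) is to fold case client-side before the query is built:

```java
import java.util.Locale;

// Wildcard terms skip analysis, so apply the same case folding a
// LowerCaseFilter would have applied at index time before sending the query.
public class WildcardFix {
    static String foldWildcard(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(foldWildcard("CM*")); // cm*
    }
}
```

With this, name:(CM*) and name:(cm*) would both be sent as the lowercase form and match the lowercased index terms.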
Another thing I don't get. The system feels like it's doing the extra
queries. I put the LogTransformer expecting to see additional output on one
of the child entities
<entity name="section_product" query="select section_title as sections from
section s, section_product sp where sp.section_id =
The issue's not about indexing, the issue's about storage. It seems like
the fields (sections, colors, sizes) are all not being stored, even though
stored="true".
I could not get Luke to work, no. The webstart just hangs at downloading
0%.
Thanks,
John
On Mon, Mar 8, 2010 at 12:06 PM, Erick
Ok - downloaded the binary off of google code and it's loading. The 3 child
entities do not appear as I had suspected.
Thanks,
John
On Mon, Mar 8, 2010 at 12:12 PM, John Ament my.repr...@gmail.com wrote:
The issue's not about indexing, the issue's about storage. It seems like
the fields
If my query were something like this: select col1, col2 from table, my
dynamic field would be something like fld_${col1}. But I could not find any
information on how to setup the DIH with dynamic fields. I saw that dynamic
fields should be supported with SOLR-742, but am not sure how to
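On the schema side, a dynamic field pattern is what catches generated names like fld_${col1}; how DIH populates them at import time is the open question, but the declaration itself is straightforward (names and types are illustrative):

```xml
<!-- schema.xml: any field whose name starts with fld_ is accepted -->
<dynamicField name="fld_*" type="string" indexed="true" stored="true"/>
```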
Shawn,
Increasing the fetch size and increasing my heap based on that did the
trick.. Thanksss a lot for your help.. your suggestions helped me a lot..
Hope these suggestions will be helpful to others too who are facing similar
kind of issue.
Thanks,
Barani
Shawn Heisey-4 wrote:
Do keep
Hi,
We have some dynamic fields getting indexed using Solr. Some of the dynamic
fields contain spaces / special characters (something like: short name, Full
Name, etc.). Is there a way to search on these fields (which contain the
spaces etc.)? Can someone let me know the filter I need to pass
I do not believe the Solr or Lucene syntax allows this.
You need to get rid of all the spaces in the field name.
If not, then you will be searching for "short" in the default field and then
"name1" in the name field.
http://wiki.apache.org/solr/SolrQuerySyntax
All
It seems like my issue is simply on the concept of child entities.
I had to add a second table to my query to pull pricing info. At first, I
was putting it in a separate entity. Didn't work, even though I added the
fields.
When I rewrote my query as
<entity name="product" query="select
Good afternoon.
We have been experiencing an odd issue with one of our Solr nodes. Upon startup
or when bringing in a new index we get a CPU spike for 5 minutes or so. I have
attached a graph of this spike. During this time simple queries return without
a problem but more complex queries do
Is this just autowarming?
Check your autowarmCount parameters in solrconfig.xml
-Yonik
http://www.lucidimagination.com
On Mon, Mar 8, 2010 at 5:37 PM, John Williams j...@37signals.com wrote:
Good afternoon.
We have been experiencing an odd issue with one of our Solr nodes. Upon
startup or
Yonik,
In all cases our autowarmCount is set to 0. Also, here is a link to our
config. http://pastebin.com/iUgruqPd
Thanks,
John
--
John Williams
System Administrator
37signals
On Mar 8, 2010, at 4:44 PM, Yonik Seeley wrote:
Is this just autowarming?
Check your autowarmCount parameters in
On Mon, Mar 8, 2010 at 6:07 PM, John Williams j...@37signals.com wrote:
Yonik,
In all cases our autowarmCount is set to 0. Also, here is a link to our
config. http://pastebin.com/iUgruqPd
Weird... on a quick glance, I don't see anything in your config that
would cause work to be done on a
Hi,
Posting Arabic PDF files to Solr using a web form (to solr/update/extract),
the extracted texts have each word displayed in reverse direction (instead of
right to left).
When I perform a search against these texts with -always- reversed key-words I
get results, but reversed.
This problem doesn't
Too bad it requires integer (long) primary keys... :/
2010/3/8 Ian Holsman li...@holsman.net
I just saw this on twitter, and thought you guys would be interested.. I
haven't tried it, but it looks interesting.
http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Solr+Plugin
Thanks for the
I think the problem is that Solr does not include the ICU4J jar, so it
won't work with Arabic PDF files.
Try putting ICU4J 3.8 (http://site.icu-project.org/download) in your classpath.
On Mon, Mar 8, 2010 at 6:30 PM, Abdelhamid ABID aeh.a...@gmail.com wrote:
Hi,
Posting arabic pdf files to
Hi,
During indexing it's taking localhost and port 8983,
index:
[echo] Indexing ./data/
[java] ./data/ http://localhost:8983/solr
In the other case, where the Solr instance is not running, what may be the reason
that Solr is not running? (I'm new to Solr.)
You mean it has to do nothing with
waitFlush=true means that the commit HTTP call waits until everything
is sent to disk before it returns.
waitSearcher=true means that the commit HTTP call waits until Solr has
reloaded the index and is ready to search against it. (For more, study
Solr warming up.)
Both of these mean that the HTTP
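In the XML update message, these flags appear as attributes on the commit element (both default to true in Solr 1.4):

```xml
<!-- posted to /update: block until flushed to disk and the new searcher is warmed -->
<commit waitFlush="true" waitSearcher="true"/>
```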
On 3/8/2010 9:21 PM, Lee Smith wrote:
I had same issue with Jetty
Adding extra memory resolved my issue, i.e.: java -Xms512m -Xmx1024m -jar
start.jar
It's in the manual, but I can't seem to find the link.
On 8 Mar 2010, at 14:09, Quan Nguyen Anh wrote:
Hi,
I have started using Solr. I had
On 3/8/2010 11:05 PM, Shawn Heisey wrote:
What database are you using? Many of the JDBC drivers try to pull the
entire resultset into RAM before feeding it to the application that
requested the data. If it's MySQL, I can show you how to fix it. The
batchSize parameter below tells it to
On 3/8/2010 11:05 PM, Shawn Heisey wrote:
What database are you using? Many of the JDBC drivers try to pull the
entire resultset into RAM before feeding it to the application that
requested the data. If it's MySQL, I can show you how to fix it. The
batchSize parameter below tells it to
... curl http://xen1.xcski.com:8080/solrChunk/nutch/select
that should be /update, not /select
On Sun, Mar 7, 2010 at 4:32 PM, Paul Tomblin ptomb...@xcski.com wrote:
On Tue, Mar 2, 2010 at 1:22 AM, Lance Norskog goks...@gmail.com wrote:
On Mon, Mar 1, 2010 at 4:02 PM, Paul Tomblin
I'm starting to learn Solr/Lucene. I'm working on a shared server and have to
use a stand alone Java install. Anyone tell me how to install OpenJDK for a
shared server account?
Dennis Gearon
Signature Warning
EARTH has a Right To Life,
otherwise we all die.
Read 'Hot,
This is an interesting idea. There are other projects to make the
analyzer/filter chain more porous, or open to outside interaction.
A big problem is that queries are analyzed, too. If you want to give
the same metadata to the analyzer when doing a query against the
field, things get tough. You
Isn't this what Lucene/Solr payloads are theoretically for?
ie:
http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/
- Jon
On Mar 8, 2010, at 11:15 PM, Lance Norskog wrote:
This is an interesting idea. There are other projects to make the
analyzer/filter chain more
A Tika integration with the DataImportHandler is in the Solr trunk.
With this, you can copy the raw HTML into different fields and process
one copy with Tika.
If it's just straight HTML, would the HTMLStripCharFilter be good enough?
http://www.lucidimagination.com/search/document/CDRG_ch05_5.7.2
Solr unique ids can be any type. The QueryElevateComponent complains
if the unique id is not a string, but you can comment out the QEC. I
have one benchmark test with 2 billion documents with an integer id.
Works great.
On Mon, Mar 8, 2010 at 5:06 PM, Don Werve d...@madwombat.com wrote:
Too bad
Yes, payloads should do this.
On Mon, Mar 8, 2010 at 8:29 PM, Jon Baer jonb...@gmail.com wrote:
Isn't this what Lucene/Solr payloads are theoretically for?
ie:
http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/
- Jon
On Mar 8, 2010, at 11:15 PM, Lance Norskog
Is this a mistake in the Tika library collection in the Solr trunk?
On Mon, Mar 8, 2010 at 5:15 PM, Robert Muir rcm...@gmail.com wrote:
I think the problem is that Solr does not include the ICU4J jar, so it
won't work with Arabic PDF files.
Try putting ICU4J 3.8
it is an optional dependency of PDFBox. If ICU is available, then it
is capable of processing Arabic PDF files.
The problem is that Arabic text in PDF files is really glyphs
(encoded in visual order) and needs to be 'unshaped' with some stuff
that isn't in the JDK.
If the size of the default ICU
Using Solr 1.4.
Was using the standard query handler, but needed the boost by field
functionality of qf from dismax.
So we altered the query to boost certain phrases against a given field.
We were using QueryElevationComponent (elevator from solrconfig.xml)
for one particular entry we wanted
Maybe some things to try:
* make sure your uniqueKey is string field type (ie if using int it will not
work)
* forceElevation to true (if sorting)
- Jon
On Mar 9, 2010, at 12:34 AM, Ryan Grange wrote:
Using Solr 1.4.
Was using the standard query handler, but needed the boost by field
It is true I need this metadata at query time too. For the moment, I put
this extra information at the beginning of the data to be indexed and at
the beginning of the query. It works, but I really don't like this. In my
case, I need the language of the data to be indexed and the language of the