Thank you.
This works well as a workaround. Yesterday I got the tip to look for a wrong
solrconfig.xml, and that was right.
When uploading our files, the solrconfig.xml was LOST ;-)
Is it possible to start Java in debug mode for more info?
David
On 16.03.2010 02:02, Tom Hill wrote:
You need a
Shalin Shekhar Mangar wrote:
On Sat, Mar 13, 2010 at 9:30 AM, Suram reactive...@yahoo.com wrote:
Erick Erickson wrote:
Did you commit your changes?
Erick
On Fri, Mar 12, 2010 at 7:38 AM, Suram reactive...@yahoo.com wrote:
Can I set my index fields for auto-suggestion?
If you're going to spend time mucking w/ TermPositions, you should just spend
your time working with SpanQuery, as that is what I understand you to be asking
about. AIUI, you want to be able to get at the positions in the document where
the query matched. This is exactly what a SpanQuery and
On Mar 15, 2010, at 11:36 AM, Jean-Sebastien Vachon wrote:
Hi All,
I'm trying to figure out how to perform spatial searches using Solr 1.5 (from
the trunk).
Is the support for spatial search built-in?
Almost. The main thing missing right now is filtering. There are still ways to do
This is my first post on this list -- apologies if this has been discussed
before; I didn't come upon anything exactly equivalent in searching the
archives via Google.
I'm using Solr 1.4 as part of the VuFind application, and I just noticed that
searches for hyphenated terms are failing in
Hi,
According to the wiki, it's possible to pass parameters to the DIH:
http://wiki.apache.org/solr/DataImportHandler#Accessing_request_parameters
I assume they are just being replaced via simple string replacement, which is
exactly what I need. Can they also be used in all places, even in attributes
Hi,
I am trying to use $deleteDocById to delete rows based on an SQL query in my
db-data-config.xml. The following tag is a top-level tag inside the document tag.
<entity name="company_del" query="SELECT e.id AS `$deleteDocById` FROM
deletedentity AS e"/>
However, it seems like it's only fetching
Hi again ,
I just came from trying the version 1.5-dev from Solr trunk.
After applying the patch you provided, and adding icu4j-3_8_1 in classpath,
the results are quite different from before.
Now words and text are not reversed and are displayed correctly, except for some
PDF files' text parts
I generate a Solr index on a Hadoop cluster and I want to copy it from HDFS to
a server running Solr.
I wish to copy the index to a different disk than the disk that the Solr
instance is using, then tell the Solr server to switch from the current data
dir to the location where I copied the Hadoop
Most of our documents will be in English, but not all, and we are certainly in
the process of acquiring more international content. Does anyone have any
experience using all of the different stemmers for languages of unknown
origin? Which ones perform the best? Give the most relevant results? What
I used it mostly for KStemmer, but I also liked the fact that it included about
a dozen or so stable patches since Solr 1.4 was released. We just use the
included WAR in our project however. We don't use the installer or anything
like that.
From: blargy
I'm trying it out right now. I hope it will work well out of the box for
indexing/searching a set of documents with frequent updates.
-aj
On Tue, Mar 16, 2010 at 11:52 AM, blargy zman...@hotmail.com wrote:
Has anyone used this?:
http://www.lucidimagination.com/Downloads/LucidWorks-for-Solr
Other
If you search the mail archive, you'll find many discussions of
multilingual indexing/searching that'll provide you a plethora
of information.
But the synopsis as I remember is that using a single stemmer for
multiple languages is generally a bad idea
Best
Erick
On Tue, Mar 16, 2010 at
I am working on an application that currently hits a database containing
millions of very large documents. I use Oracle Text Search at the moment, and
things work fine. However, there is a request for faceting capability, and Solr
seems like a technology I should look at. Suffice to say I am
Kevin,
When you say you just included the war, you mean the /packs/solr.war, correct?
I see that the KStemmer is nicely packed in there but I don't see LucidGaze
anywhere. Have you had any experience using this?
So I'm guessing you would suggest using the LucidWorks solr.war over the
Why do you think you'd hit OOM errors? How big is very large? I've
indexed, as a single document, a 26-volume encyclopedia of Civil War
records.
Although as much as I like the technology, if I could get away without using
two technologies, I would. Are you completely sure you can't get what
Hello Experts,
I need help on this issue of mine. I am unsure if this scenario is possible.
I have a field in my Solr document named inputxml, the value of which is an
XML string as below. This XML structure is within the inputxml field value. I
need help on searching this XML structure, i.e.
For my purposes, the Porter analyzer was overly aggressive with stemming. So,
we then moved to KStem. It looks like this is no longer being maintained and
Lucid claimed much better performance with theirs, so I gave that a try and it
seems to be working fine. I didn't do any benchmarks though.
I've also indexed a concatenation of 50k journal articles (making a
single document of several hundred MB of text) and it did not give me
an OOM.
-glen
On 16 March 2010 15:57, Erick Erickson erickerick...@gmail.com wrote:
Why do you think you'd hit OOM errors? How big is very large? I've
I've been trying to bulk index about 11 million PDFs, and while profiling our
Solr instance, I noticed that all of the threads that are processing indexing
requests are constantly blocking each other during this call:
http-8080-Processor39 [BLOCKED] CPU time: 9:35
If you do stay with Oracle, please report back to the list how that went. In
order to get decent filtering and faceting performance, I believe you will need
to use bitmapped indexes which Oracle and some other databases support.
You may want to check out my article on this subject:
Do you have the option of just importing each XML node as a
field/value when you add the document?
That'll let you do the search easily. If you need to store the raw XML,
you can use an extra field.
Tommy Chheng
Programmer and UC Irvine Graduate Student
Twitter @tommychheng
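To make that concrete, the add document might look something like this (field names here are made up for illustration; the extra field keeps the raw XML stored for retrieval, while searches run against the unpacked fields):

```xml
<add>
  <doc>
    <field name="id">doc-1</field>
    <!-- values unpacked from the XML structure, one field per node -->
    <field name="customer_name">Acme Corp</field>
    <field name="order_status">shipped</field>
    <!-- the raw XML kept in an extra stored field for retrieval -->
    <field name="inputxml">&lt;order&gt;&lt;customer&gt;Acme Corp&lt;/customer&gt;&lt;/order&gt;</field>
  </doc>
</add>
```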
Hey,
I am trying to understand what kind of calculation I should do in order to
come up with reasonable RAM size for a given solr machine.
Suppose the index size is at 16GB.
The Max heap allocated to JVM is about 12GB.
The machine I'm trying now has 24GB.
When the machine is running for a while
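One rough way to frame that calculation (my own rule-of-thumb sketch, not from this thread): whatever RAM the JVM heap doesn't take is all the OS has left to cache the index files, and ideally that leftover is close to the index size.

```python
def page_cache_headroom(total_ram_gb, jvm_heap_gb, index_size_gb):
    """Rule-of-thumb check: RAM left for the OS page cache after the
    JVM heap is reserved, and whether the whole index fits in it."""
    leftover = total_ram_gb - jvm_heap_gb
    return leftover, leftover >= index_size_gb

# The numbers from the question: 24GB machine, 12GB heap, 16GB index.
leftover, fits = page_cache_headroom(24, 12, 16)
print(leftover, fits)  # 12 False -> only 12GB left to cache a 16GB index
```

By this rough measure, a 12GB heap on a 24GB box leaves too little for the OS to cache a 16GB index, so either the heap or the machine size may need revisiting.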
Hmm, that is an ugly thing in PDFBox. We should probably take this over to the
PDFBox project. How many threads are you indexing with?
FWIW, for that many documents, I might consider using Tika on the client side
to save on a lot of network traffic.
-Grant
On Mar 16, 2010, at 4:37 PM,
That is a great article, David.
For the moment, I am trying an all-Solr approach, but I have run into a small
problem. The documents are stored as XML CLOBs using Oracle's OPAQUE object.
Is there any facility to unpack this into the actual text? Or must I execute
that in the SQL query?
Originally 16 (the number of CPUs on the machine), but even with 5 threads it's
not looking so hot.
-----Original Message-----
From: Grant Ingersoll [mailto:gsi...@gmail.com] On Behalf Of Grant Ingersoll
Sent: Tuesday, March 16, 2010 5:15 PM
To: solr-user@lucene.apache.org
Subject: Re:
Guys, I think this is an issue with PDFBox and the version that Tika 0.6
depends on. Tika 0.7-trunk upgraded to PDFBox 1.0.0 (see [1]), so it may
include a fix for the problem you're seeing.
See this discussion [2] on how to patch Tika to use the new PDFBox if you can't
wait for the 0.7
NoClassDefFoundError usually means that the class was found, but it
needs other classes and those were not found. That is, Solr finds the
ExtractingRequestHandler jar but cannot find the Tika jars.
In example/solr/conf/solrconfig.xml, there are several <lib
dir="path"/> elements. These give
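For reference, those elements look like the following (paths are illustrative; point them at wherever the extraction handler's Tika jars actually live):

```xml
<!-- make the extraction handler's jars and its Tika dependencies visible -->
<lib dir="../../contrib/extraction/lib" />
<lib dir="/path/to/tika/jars" />
```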
They are a namespace like other namespaces and are usable in
attributes, just like in the DB query string examples.
As for defaults, you can declare those in the requestHandler
declarations in solrconfig.xml. There are examples of this (search for
'defaults') on the wiki page.
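Sketching both halves of that (entity, parameter, and column names here are made up): a request parameter is referenced as ${dataimporter.request.name} inside data-config.xml, and a default for it can live in the handler's defaults list in solrconfig.xml.

```xml
<!-- data-config.xml: use the request parameter inside an attribute -->
<entity name="item"
        query="SELECT * FROM item WHERE category = '${dataimporter.request.cat}'"/>

<!-- solrconfig.xml: declare a default value for that parameter -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
    <str name="cat">books</str>
  </lst>
</requestHandler>
```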
On Tue, Mar 16, 2010 at 7:05
I'm pretty unclear on how to patch the Tika 0.7-trunk on our Solr instance.
This is what I've tried so far (which was really just me guessing):
1. Got the latest version of the trunk code from
http://svn.apache.org/repos/asf/lucene/tika/trunk
2. Built this using Maven (mvn install)
Hi guys,
Based on some suggestions, I'm trying to use the dismax query
type. I'm getting a weird error, though, that I think is related to the
default test data set.
From the query tool (/solr/admin/form.jsp), I put in this:
Statement: artist:test title:test +type:video
query type: dismax
The DataImportHandler has tools for this. It will fetch rows from
Oracle and allow you to unpack columns as XML with Xpaths.
http://wiki.apache.org/solr/DataImportHandler
http://wiki.apache.org/solr/DataImportHandler#Usage_with_RDBMS
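A sketch of that wiring in data-config.xml (the table, column, and XPath names are assumptions for illustration): the FieldReaderDataSource reads the XML column of the outer row, and an inner XPathEntityProcessor entity unpacks it.

```xml
<dataConfig>
  <dataSource name="db" driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//dbhost:1521/ORCL" user="u" password="p"/>
  <!-- reads a column of the parent entity's row as a stream -->
  <dataSource name="fieldReader" type="FieldReaderDataSource"/>
  <document>
    <entity name="doc" dataSource="db"
            query="SELECT id, doc_date, xml_clob FROM documents">
      <!-- unpack the XML column with XPaths -->
      <entity name="body" dataSource="fieldReader"
              processor="XPathEntityProcessor"
              dataField="doc.xml_clob" forEach="/document">
        <field column="text" xpath="/document/text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```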
Since my original thread was straying to a new topic, I thought it made sense
to create a new thread of discussion.
I am using the DataImportHandler to index 3 fields in a table: an id, a date,
and the text of a document. This is an Oracle database, and the document is an
XML document stored
Lance,
I tried that but no luck. Just in case the relative paths were causing a
problem, I also tried using absolute paths but neither seemed to help.
First, I tried adding <lib dir="/path/to/example/solr/lib" /> as the
full directory so it would hopefully include everything. When that
didn't
Disclaimer: My Oracle experience is minuscule at best. I am also a
beginner at Solr, so grab yourself the proverbial grain of salt.
I googled a bit on CLOB. One page I found mentioned setting up a view
to return the data type you want. Can you use the functions described
on these pages in
On Tue, Mar 16, 2010 at 9:08 PM, KaktuChakarabati jimmoe...@gmail.comwrote:
Hey,
I am trying to understand what kind of calculation I should do in order to
come up with reasonable RAM size for a given solr machine.
Suppose the index size is at 16GB.
The Max heap allocated to JVM is about
I suspect your problem is that you still have price defined in
solrconfig.xml for the dismax handler. Look for the section
<requestHandler name="dismax" ...>.
You'll see price defined as one of the default fields for fl and bf.
HTH
Erick
On Tue, Mar 16, 2010 at 6:55 PM, Alex Thurlow
Besides the other notes here, I agree you'll hit OOM if you try to
read all the rows into memory at once, but I'm absolutely sure you
can read them N at a time instead. Not that I could tell you how, mind
you.
You're on your way...
Erick
On Tue, Mar 16, 2010 at 4:13 PM, Neil Chaudhuri
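On the N-at-a-time point, the DataImportHandler's JdbcDataSource does expose a batchSize attribute for this (a sketch; the connection details are made up, and exact streaming behavior depends on the JDBC driver):

```xml
<!-- fetch rows in batches rather than loading the whole result set -->
<dataSource type="JdbcDataSource"
            driver="oracle.jdbc.OracleDriver"
            url="jdbc:oracle:thin:@//dbhost:1521/ORCL"
            user="solr" password="secret"
            batchSize="500"/>
```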
Aha. That appears to be the issue. I hadn't realized that the query
handler had all of those definitions there.
-Alex
On 3/16/2010 6:56 PM, Erick Erickson wrote:
I suspect your problem is that you still have price defined in
solrconfig.xml for the dismax handler. Look for the section
It seems that Solr's query parser doesn't pass a single term query
to the Analyzer for the field. For example, if I give it
2001年 (year 2001 in Japanese), the searcher returns 0 hits
but if I quote them with double-quotes, it returns hits.
In this experiment, I configured schema.xml so that
the
Hi,
I am using autobench to benchmark Solr with the query
http://localhost:8983/solr/select/?q=body:hotel AND
_val_:recip(hsin(0.7113258,-1.291311553,lat_rad,lng_rad,30),1,1,0)^100
But if i specify the same in the autobench command as
autobench --file bar1.tsv --high_rate 100 --low_rate 20
There are certainly a number of widely varying opinions on the use of RAM
directory.
Basically, though, if you need the index to be persistent at some point
(i.e. saved across reboots, crashes etc.),
you'll need to write to a disk, so RAM directory becomes somewhat
superfluous in this case.
I was reading Scaling Lucene and Solr
(http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr/)
and I came across the section StopWords.
In there it mentions that it's not recommended to remove stop words at index
time. Why is this the case? Don't all
[java] INFO: The APR based Apache Tomcat Native library which allows optimal
performance in production environments was not found on the
java.library.path:
.:/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java
What the heck is this and why is it recommended for production
org/apache/solr/util/plugin/SolrCoreAware in the stack trace refers to
an interface in the main Solr jar.
I think this means that putting all of the libs in
apache-tomcat-6.0.20/lib is a mistake: the classloader finds
ExtractingRequestHandler in
Hi all, we translated the Solr tutorial to Spanish due to a client's
request. For all you Spanish speakers/readers out there, you can have a look
at it:
http://www.linebee.com/?p=155
We hope this can expand the usage of the project and lower the language
barrier for non-English speakers.
Thanks
That would be a Tomcat question :)
On Tue, Mar 16, 2010 at 8:36 PM, blargy zman...@hotmail.com wrote:
[java] INFO: The APR based Apache Tomcat Native library which allows optimal
performance in production environments was not found on the
java.library.path:
Use a + sign or %20 for the space. The URL standard uses a plus to mean a space.
On Tue, Mar 16, 2010 at 6:06 PM, KshamaPai kshamapai2...@gmail.com wrote:
Hi,
I am using autobench to benchmark Solr with the query
http://localhost:8983/solr/select/?q=body:hotel AND
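As a quick illustration of that escaping (Python here, just to show the two encodings side by side):

```python
from urllib.parse import quote, quote_plus

q = "body:hotel AND"
print(quote(q))       # body%3Ahotel%20AND  (space as %20)
print(quote_plus(q))  # body%3Ahotel+AND    (space as +)
```

Either form should get the full query through autobench without the URL being cut at the space.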
Hi Giovanni,
Comments below:
I'm pretty unclear on how to patch the Tika 0.7-trunk on our Solr instance.
This is what I've tried so far (which was really just me guessing):
1. Got the latest version of the trunk code from
http://svn.apache.org/repos/asf/lucene/tika/trunk
2.
You need to change your similarity object to be more sensitive at the
short end. There is a patch describing how to do this:
http://issues.apache.org/jira/browse/LUCENE-2187
It involves Lucene coding.
On Fri, Mar 12, 2010 at 3:19 AM, muneeb muneeba...@hotmail.com wrote:
Ah I see.
Thanks very much
In Solr, how can I perform AND, OR, and NOT searches while querying the data?
--
View this message in context:
http://old.nabble.com/Issue-in-search-tp27927828p27927828.html
Sent from the Solr - User mailing list archive at Nabble.com.
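For what it's worth, the standard Lucene query syntax covers this; a sketch with made-up field names and terms:

```
name:samsung AND type:phone      both clauses must match
name:samsung OR name:nokia       either clause may match
name:samsung NOT type:tv         excludes matches on the second clause
name:(+samsung -galaxy)          the same idea with +required/-prohibited
```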
Just turn your entire disk into RAM:
http://www.hyperossystems.co.uk/
800X faster. Who cares if it swaps to 'disk' then :-)
Dennis Gearon
Signature Warning
EARTH has a Right To Life,
otherwise we all die.
Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php