On Thu, Aug 6, 2009 at 4:10 AM, Robert Petersen rober...@buy.com wrote:
Maintenance Questions: In a two slave one master setup where the two
slaves are behind load balancers what happens if I have to restart solr?
If I have to restart solr say for a schema update where I have added a
new
Hi,
I have a search engine on Solr. Also I have a remote web application which
will be using the Solr Indexes for search.
I have three scenarios:
1) Transfer the Indexes to the Remote Application.
- This will reduce load on the actual Solr server and make searches
faster.
- Need to write
On Thu, Aug 6, 2009 at 12:24 PM, Ninad Raut hbase.user.ni...@gmail.com wrote:
Hi,
I have a search engine on Solr. Also I have a remote web application which
will be using the Solr Indexes for search.
I have three scenarios:
1) Transfer the Indexes to the Remote Application.
- This will
Hi all,
I know how to configure solr.home with Tomcat 6, but I don't know how to
set solr.home with Glassfish (V2.1). I have tried to set solr.home in
.profile as follows:
export solr.home=/home/huenzhao/search/solr
export solr/home=/home/huenzhao/search/solr
export
Hi,
I have documents containing the words healthcare articles. I need to match the
healthcare-articles documents
for the query strings helath, articles...
I tried q=health*, q=helath*, q=heath*articles but everything returns an
empty result. When I try q=healthcare artilces, the search returns proper
Go through this thread first - http://markmail.org/message/bannl2fpblt5sqlw
If it still does not help, post back your field type definition in
schema.xml
Cheers
Avlesh
On Thu, Aug 6, 2009 at 3:46 PM, Radha C. cra...@ceiindia.com wrote:
Hi,
I have documents contain word healthcare articles.
You have to quote values that include whitespace:
export JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/home/huenzhao/search/solr"
or to make it accessible for other paths as well:
export SOLR_HOME=/home/huenzhao/search/solr
export JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=$SOLR_HOME"
Cheers,
Chantal
Hi,
Is there a way to import existing Lucene Indexes to SOLR?? I have a huge
lucene index which I want to import into SOLR server.
Regards,
Ninad Raut.
just copy the whole index into data_dir/index and start Solr. That
should work just fine
On Thu, Aug 6, 2009 at 5:17 PM, Ninad Raut hbase.user.ni...@gmail.com wrote:
Hi,
Is there a way to import existing Lucene Indexes to SOLR?? I have a huge
lucene index which I want to import into SOLR server.
You're kidding, right? :)
Noble Paul നോബിള് नोब्ळ् wrote:
just copy the whole index into data_dir/index and start Solr. That
should work just fine
On Thu, Aug 6, 2009 at 5:17 PM, Ninad Raut hbase.user.ni...@gmail.com wrote:
Hi,
Is there a way to import existing Lucene Indexes to SOLR?? I have a
What about the schema and querying? There should be some changes to the
Solr schema, I think. Correct me if I am wrong.
2009/8/6 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com
just copy the whole index into data_dir/index and start Solr. That
should work just fine
On Thu, Aug 6, 2009 at 5:17
I am also interested in knowing! Does it work?
Cheers
Avlesh
On Thu, Aug 6, 2009 at 5:23 PM, Mark Miller markrmil...@gmail.com wrote:
You're kidding, right? :)
Noble Paul നോബിള് नोब्ळ् wrote:
just copy the whole index into data_dir/index and start Solr. That
should work just fine
On Thu, Aug 6,
What about the schema and querying? There should be some changes to the
Solr schema, I think. Correct me if I am wrong.
Of course! You have to create your own schema inside the schema.xml and
adjust values inside solrconfig.xml at the bare minimum to get started.
Cheers
Avlesh
On Thu, Aug 6,
Yeah, the big part was missed. You need to set up a schema.xml matching
the field names and types, and you would need a solrconfig.xml. But
getting the schema right would be the challenge.
On Thu, Aug 6, 2009 at 5:23 PM, Mark Miller markrmil...@gmail.com wrote:
You're kidding, right? :)
Noble Paul
But getting the schema right would be the challenge.
If I know my fields and there are not many in the Lucene index, I should not
face any problem creating a schema, or are there any pitfalls which I should
be aware of?
Thanks for such quick replies guys.
2009/8/6 Noble Paul നോബിള് नोब्ळ्
If I know my fields and there are not many in the Lucene index, I should not
face any problem creating a schema, or are there any pitfalls which I should
be aware of?
Nothing specific. Creating the schema should be very straightforward.
Just make sure you use the right field types.
Cheers
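To make that concrete, here is a minimal schema.xml sketch. The field and type names are hypothetical; they must mirror what the existing Lucene index actually contains, and the analyzers must match the ones used at index time:

```xml
<!-- Hypothetical sketch: names, types, and analyzers must match
     whatever the existing Lucene index was built with -->
<schema name="imported" version="1.1">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
    <fieldType name="text" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="body" type="text" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
```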
Hi,
Solr is fine out of RAM if you don't change it (build and then let it cache
what it needs). The RAM is needed when you constantly pepper it with updates
and commits. If you can have the logs update certain shards and then merge
those indexes periodically to machines you can leave alone - this
Hi again,
1.4 runs fine for me now, but I'm still struggling to find the correct
delete query. There is little to no documentation at all for the new
special commands, and I have problems guessing the correct setup from
reading through the code. SOLR-1060 is not enough help.
I've come up with a
Hi ,
In my application I am trying to search with some special characters like , $ #.
Solr returns all the search results available. Some of the characters
like _ . are not encoded in the search URL. Does anyone have any idea what
the root cause of this could be?
I am using Jetty
On Thu, Aug 6, 2009 at 6:41 PM, Chantal
Ackermann chantal.ackerm...@btelligent.de wrote:
Hi again,
1.4 runs fine for me now, but I'm still struggling to find the correct delete
query. There is little to no documentation at all for the new special commands,
and I have problems guessing the correct
Great! *bow*
Thanks,
Chantal
<entity name="delete_from_index" pk="GROUPID" transformer="TemplateTransformer"
        query="select GROUPID from DEFINITION
               where LANGUAGE='de'
               and CHANGED_DATE > '${dataimporter.last_index_time}'">
<field column="$deleteDocByQuery"
I'm guessing it is because you have your spell checker mapped to the
"spellchecker" request handler, but you are asking the "standard"
request handler to build the spell checker. Unless you've modified
the Standard Req Handler, it is not spell-check aware.
Try
Hi everyone,
I'm indexing several documents that contain words that the StandardTokenizer
cannot detect as tokens. These are words like
C#
.NET
C++
which are important for users to be able to search for, but get treated as
"C", "NET", and "C".
How can I create a list of words that should be
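One common approach (a sketch, not something confirmed in this thread): tokenize on whitespace, lowercase, and keep terms like c#, .net, and c++ intact by listing them in the WordDelimiterFilter's protected-words file. The field type name is made up:

```xml
<!-- Sketch: protwords.txt would contain lines such as
       c#
       .net
       c++
     The LowerCaseFilter runs first so the protected list can be lowercase. -->
<fieldType name="text_code" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            protected="protwords.txt"
            generateWordParts="1" generateNumberParts="1"/>
  </analyzer>
</fieldType>
```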
You are right. Replication was disabled after the server was restarted, and
then I saw the behavior. After I added some data, the indexversion command
returned the right values. So it seems Solr behaved correctly.
Thanks,
2009/8/5 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com
how is the
You could create a new working core, then call the swap command once
it is ready. Then remove the work core and delete the appropriate index
folder at your convenience.
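For reference, the swap itself is a single CoreAdmin call. A small Python sketch that just builds the URL (the host, port, and core names here are assumptions):

```python
from urllib.parse import urlencode

# Hypothetical names: "live" serves traffic, "rebuild" is the freshly built core.
params = {"action": "SWAP", "core": "live", "other": "rebuild"}
swap_url = "http://localhost:8983/solr/admin/cores?" + urlencode(params)
print(swap_url)
# Issuing an HTTP GET against swap_url performs the swap atomically.
```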
-Original Message-
From: Robert Petersen [mailto:rober...@buy.com]
Sent: Wednesday, August 05, 2009 6:41 PM
To:
Something similar has been discussed earlier. Go through this thread -
http://www.lucidimagination.com/search/document/b5977650557f50cb/problem_with_query_parser
PS: Solr is pronounced as "Solar" but written without the "a".
Cheers
Avlesh
On Thu, Aug 6, 2009 at 7:18 PM, Deepak VSVK
Here is another idea. With solr multicore you can dynamically spin up
extra cores and bring them online. I'm not sure how well this would
work for us since we have hard coded the names of the cores we are
hitting in our config files.
-Original Message-
From: Brian Klippel
Design so that you can handle the load with one server down (N+1
sizing), then take one server out for any maintenance. Simple and
works fine.
wunder
On Aug 6, 2009, at 9:25 AM, Robert Petersen wrote:
Here is another idea. With solr multicore you can dynamically spin up
extra cores and
Hi all,
to keep this thread up to date... ;-)
d) jdbc batch size
changed to 10. (Was default: 500, then 1000)
The problem with my dih setup is that the root entity query returns a
huge set (all ids that shall be indexed). A larger fetchsize would be
good for that query.
The nested entity,
On Mon, Aug 3, 2009 at 12:32 PM, Chantal
Ackermannchantal.ackerm...@btelligent.de wrote:
avg-cpu: %user %nice %sys %iowait %idle
1.23 0.00 0.03 0.03 98.71
Basically, it is doing very little? *scratch*
How often is commit being called? (a Lucene commit sync's all
Hi, I would really appreciate some help on this.
I'm doing a category browser for companies, kind of like the yellow pages.
For each company I store each category the company is in, like this:
an example for Boeing would be
03.03.02
which is a fictional id for 'Jets'.
At the beginning I display all
does DIH call commit periodically, or are things done in one big batch?
AFAIK, one big batch.
Cheers
Avlesh
On Thu, Aug 6, 2009 at 11:23 PM, Yonik Seeley yo...@lucidimagination.com wrote:
On Mon, Aug 3, 2009 at 12:32 PM, Chantal
Ackermann chantal.ackerm...@btelligent.de wrote:
avg-cpu:
I'm investigating a problem I bet some of you have hit before, and exploring
several options to address it. I suspect that this specific IDF scenario is
common enough that it even has a name, though I'm not sure what it would be
called.
The scenario:
Suppose you have a search application focused on
As soon as I started reading your message I started thinking common
grams, so that is what I would try first, esp. since somebody already
did the work of porting that from Nutch to Solr (see Solr JIRA).
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch,
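For anyone unfamiliar with the idea, here is a toy Python sketch of what a common-grams filter does (an illustration of the concept only, not Solr's actual implementation): whenever a token pair involves a very frequent word, a combined bigram token is emitted alongside the originals, so searches involving those frequent words can match a much more selective term.

```python
COMMON = {"the", "to", "be", "of", "a"}  # assumed high-frequency words

def common_grams(tokens):
    """Emit each token, plus a joined bigram whenever either member is common."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens) and (tok in COMMON or tokens[i + 1] in COMMON):
            out.append(tok + "_" + tokens[i + 1])
    return out

print(common_grams(["to", "be", "or", "not"]))
# ['to', 'to_be', 'be', 'be_or', 'or', 'not']
```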
Hey there,
We're trying to add foreign language support into our new search
engine -- languages like Arabic, Farsi, and Urdu (that don't work with
standard analyzers). But our data source doesn't tell us which
languages we're actually collecting -- we just get blocks of text. Has
anyone here
Bradford, there is an Arabic analyzer in trunk. For Farsi there is
currently a patch available:
http://issues.apache.org/jira/browse/LUCENE-1628
One option is not to detect languages at all.
It could be hard for short queries due to the languages you mentioned
borrowing from each other.
But you
Hi...
Is there any way to group values like shopping.yahoo.com or shopper.cnet.com do?
For instance, I have documents like:
doc1 - product_name1 - value1
doc2 - product_name1 - value2
doc3 - product_name1 - value3
doc4 - product_name2 - value4
doc5 - product_name2 - value5
doc6 - product_name2
Hello,
I think you are confusing the size of the data you want to index with the
size of the index. For our indexes (large full text documents) the Solr
index is about 1/3 of the size of the documents being indexed. For 3 TB of
data you might have an index of 1 TB or less. This depends on
For first-time loads I currently post to
/update/csv?commit=false&separator=%09&escape=\&stream.file=workfile.txt&map=NULL:&keepEmpty=false
and this works well, finishing in about 20 minutes for my workload.
It is mostly CPU bound; I have an 8-core box and it seems one core takes
the brunt of the work.
If you can reindex, simply rebuild the index with fields replaced by
combining existing fields.
-Yao
-Original Message-
From: David Lojudice Sobrinho [mailto:dalss...@gmail.com]
Sent: Thursday, August 06, 2009 4:17 PM
To: solr-user@lucene.apache.org
Subject: Item Facet
Hi...
Is there
You should stand to benefit from concurrent loading. Certainly the text
analysis would end up being done concurrently; I'm not sure what else benefits
from it but I think there are other things. Ideally you could try a
configurable number of concurrent loads and pick the one that gets the job
Did a bit more creative searching for a solution and came up with this:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg15027.html
I'm using a couple-of-days-old nightly build, so unless there is
something new I should know about, I'm going with that method :)
2009/8/6 Jón Helgi Jónsson
By the way, I was using command=indexversion to verify whether replication is on or
off. Since it seems unreliable, is there a better way to do it?
Thanks,
On Thu, Aug 6, 2009 at 8:43 AM, solr jay solr...@gmail.com wrote:
You are right. Replication was disabled after the server was restarted, and
then
I can't reindex because the aggregated/grouped result should change as
the query changes... in other words, the result must be dynamic.
We've been thinking about a new handler for it, something like:
/select?q=laptop&rows=0&itemfacet=on&itemfacet.field=product_name,min(price),max(price)
Does it
Are those 'blocks of text' (Unicode) Java strings? I don't think
this is the case, but if so, use Character.UnicodeBlock to identify the
language of the text.
And is it just text files with unknown character encoding? Then ICU
has a 'charset detector' that you can use. This feature 'suggests'
fyi, you can use the block property, but I think even better is to use
the unicode script property: http://unicode.org/reports/tr24/ . This
is easier because some characters are common across different scripts.
Also, some scripts span multiple unicode blocks.
This is the direction I was heading
Google Translate just released (last week) its language API with translation
and LANGUAGE DETECTION.
:)
It's very simple to use, and you can query it with some text to detect which
language it is.
Here is a simple example using groovy, but all you need is the url to
query:
There is a patch for it:
https://issues.apache.org/jira/browse/SOLR-64
Koji
Jón Helgi Jónsson wrote:
Did a bit more creative searching for a solution and came up with this:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg15027.html
I'm using couple of days old nightly build, so
I'm using SolrJ. When I attempt to set up a query to retrieve the maximum id
in the index, I'm getting an exception.
My setup code is:
final SolrQuery params = new SolrQuery();
params.addSortField("id", ORDER.desc);
params.setRows(1);
params.setQuery(queryString);
I have tried that, but it also did not work!
The goal of setting solr.home is to have Solr start when Tomcat 6 is
starting.
So I think the problem is that Solr cannot start by setting solr.home
when Glassfish is starting.
Chantal Ackermann wrote:
You have to quote values that
Dynamic fields might be an answer. If you had a field called product_* and
these were populated with the corresponding values during indexing then
faceting on these fields will give you the desired behavior.
The only catch here is that the product names have to be known upfront. A
wildcard
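A sketch of what the dynamic-field declaration could look like in schema.xml (the name pattern and type here are assumptions):

```xml
<!-- Sketch: any incoming field named product_<something> is accepted,
     e.g. product_name1, which can then be used with facet.field=product_name1 -->
<dynamicField name="product_*" type="string"
              indexed="true" stored="true" multiValued="true"/>
```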
It looks like you export JAVA_OPTS in your .profile, but I bet Tomcat also sets
and thus overrides this same JAVA_OPTS in its own startup script. So that is
what you should edit and modify. I'm a Jetty user, so I don't have a Tomcat
startup script to check for you.
Otis
--
Sematext is
Chris Hostetter wrote:
: I need to tokenize my field on whitespace, html, punctuation, and apostrophes,
: but if I use HTMLStripStandardTokenizerFactory it strips only html,
: but not apostrophes
you might consider using one of the HTML Tokenizers, and then use a
PatternReplaceFilterFactory ...
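A sketch of what such an analyzer chain could look like (the field type name is made up, and exact class availability depends on your Solr version):

```xml
<fieldType name="text_noapos" class="solr.TextField">
  <analyzer>
    <!-- The HTMLStrip tokenizer removes the markup; the pattern filter
         then strips apostrophes from the resulting tokens -->
    <tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="'" replacement="" replace="all"/>
  </analyzer>
</fieldType>
```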
Hi Noble,
Can you explain a bit more on how to use Solr out of the box? I am
looking at ways to design the UI for the remote application quickly and with
fewer problems.
Also could you elaborate more on what can go wrong with the first option?
Thanks.
2009/8/6 Noble Paul നോബിള് नोब्ळ्
Bradford,
If I may:
Have a look at http://www.sematext.com/products/language-identifier/index.html
And/or http://www.sematext.com/products/multilingual-indexer/index.html
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP,
params.setQuery(queryString);
The query string is *:*, right?
Your id field is sortable, right?
Cheers
Avlesh
On Fri, Aug 7, 2009 at 5:58 AM, Reuben Firmin reub...@benetech.org wrote:
I'm using SolrJ. When I attempt to set up a query to retrieve the maximum
id
in the index, I'm getting
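For cross-checking outside SolrJ, the same max-id request can be issued as a raw HTTP query. A Python sketch that builds the URL (host, port, and field name are assumptions):

```python
from urllib.parse import urlencode

# Assumptions: default host/port, the unique key field is named "id".
params = {"q": "*:*", "sort": "id desc", "rows": 1, "fl": "id"}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
# The first (and only) document returned holds the maximum id.
```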
About the first option, caches are more effective with more traffic,
so ten front end servers using three Solr servers will have better
caching and probably better overall performance than having separate
search on all ten servers. You can even put an HTTP cache in there and
get better
The remote web app will be accessing the Solr server via the internet. It's not an
intranet setup.
On Fri, Aug 7, 2009 at 10:19 AM, Walter Underwood wun...@wunderwood.org wrote:
About the first option, caches are more effective with more traffic, so ten
front end servers using three Solr servers will
Then you should consider replicating the index to the local intranet
and still run it as a separate app.
On Fri, Aug 7, 2009 at 10:53 AM, Ninad Raut hbase.user.ni...@gmail.com wrote:
The remote web app will be accessing the Solr server via the internet. It's not an
intranet setup.
On Fri, Aug 7,
Then you should consider replicating the index to the local intranet
and still run it as a separate app.
Will it be the same master-slave replication? If the master is
multicore, can I specifically replicate the index of a certain core? Thanks
for the help.
2009/8/7 Noble Paul നോബിള്
Have you tried setting solr home via JNDI? I think you can set it via
solr/home, but that would require adding this to your servlet context
configuration.
Another option is to trace the startup scripts for Glassfish and see what
environment variables are passed in. JAVA_OPTS would make sense
All,
An off-and-on project of mine has been to work on refactoring the way we
load data from MySQL into Solr. Our current approach is fairly hard-coded
and not as configurable as I would like. I was curious, for people who have used
the DIH and/or LuSQL to load data into Solr, how much data you