Yes, I consciously let my slaves run away from the master in order to
reduce update latency, but every now and then they sync up with the master,
which does the heavy lifting.
The price you pay is that the slaves do not see the same documents as the
master, but this is the case anyhow with replication, in
Thanks a lot. We will use the UniqueKey feature and build versioning
ourselves. Do you think it would be a good idea if we built a versioning
feature into Solr/Lucene instead of doing it outside, so that others can
benefit from the feature as well? Guess contributions will be made
according to
Per Steffensen wrote:
Thanks a lot. We will use the UniqueKey feature and build versioning
ourselves. Do you think it would be a good idea if we built a
versioning feature into Solr/Lucene instead of doing it outside, so
that others can benefit from the feature as well? Guess contributions
Sounds much clearer to me than before. :)
Off the top of my head, I have two ideas.
First: let replication run asynchronously.
If shard1 is pulling the new index from the master and therefore very
recent documents aren't available there anymore, shard2 will still find them in
the meantime. As soon as shard1 is up-to-date
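As an illustration only (not the poster's actual setup), asynchronous pull replication on a slave is configured in solrconfig.xml roughly like this; the master URL and poll interval are placeholder values:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- placeholder master URL -->
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <!-- hh:mm:ss between polls; a slave lags the master by at most this interval plus copy time -->
    <str name="pollInterval">00:00:20</str>
  </lst>
</requestHandler>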
I'm new to SOLR and trying to get a proper understanding of what's going on
with fields, facets, and search results.
I've modified the example schema.xml and solrconfig.xml that come with Solr
to reflect some fields I want to experiment with. I've also modified the
velocity templates in
Hi,
I am using solr_ruby in my Ruby code; for that I start the Solr server using
start.jar.
Now I want to write mock objects for the Solr connection and for the code in my
Ruby file that searches data from Solr.
Can anybody suggest how to do this testing without starting the Solr server?
Check your schema config file first.
It looks like you have missed copying the section_text_content field's content
to your default search field:
<defaultSearchField>text</defaultSearchField>
<copyField source="section_text_content" dest="text"/>
Hi,
When I try to index my location field I get this error for each document:
ATTENTION: Error creating document. Error adding field
'emploi_city_geoloc'='48.85,2.5525'
(so I have 0 documents indexed)
Here is my schema.xml:
<field name="emploi_city_geoloc" type="location" indexed="true"
Hi, all
Thanks for your responses.
I'd tried
[NOW/DAY-30DAY+TO+NOW/DAY-1DAY-1SECOND]
and it seems to work fine for me.
Thanks a lot!
Dear all,
I wonder how data in HBase is indexed. Solr is currently used in my system
because data is managed in an inverted index. Such an index is suitable for
retrieving unstructured data in huge amounts. How does HBase deal with this
issue? Could I replace Solr with HBase?
Thanks so much!
Best regards,
Koji Sekiguchi wrote
(12/02/22 11:58), dhaivat wrote:
Thanks for the reply,
but can you please tell me why it's working for some documents and not for
others?
As Solr 1.4.1 does not recognize the hl.useFastVectorHighlighter flag, Solr just
ignores it, but because hl=true is there, Solr tries to
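For context, in Solr releases that do support it (3.1 and later), the FastVectorHighlighter needs term vectors with positions and offsets on the highlighted field; a rough sketch with an illustrative field name:

<!-- schema.xml: the highlighted field must store term vectors, positions and offsets -->
<field name="content" type="text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>

The request then adds hl=true&hl.fl=content&hl.useFastVectorHighlighter=true.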
You'll *really like* the SolrCloud stuff going into trunk when it's baked
for a while
Best
Erick
On Wed, Feb 22, 2012 at 3:25 AM, eks dev eks...@googlemail.com wrote:
Yes, I consciously let my slaves run away from the master in order to
reduce update latency, but every now and then they
Erick,
You'll *really like* the SolrCloud stuff going into trunk when it's baked
for a while
How stable is SolrCloud at the moment?
I can not wait to try it out.
Kind regards,
Em
On 22.02.2012 14:45, Erick Erickson wrote:
You'll *really like* the SolrCloud stuff going into trunk when
Hi Erik,
I have tried the links you gave. While running rake
I am getting this error:
==
Errno::ECONNREFUSED: No connection could be made because the target machine
actively refused it. - connect(2)
Is anybody aware of any effort to port Solr to Netty (or
any other async-IO based framework)?
Even on medium load (10 parallel clients) with 16 shards,
performance seems to deteriorate quite sharply compared to another,
async-IO based, alternative solution as the load
I'm not sure what to suggest at this point... obviously your test setup is
trying to hit a Solr server that isn't running. Check the host and port that
it is trying and ensure that Solr is running as your tests expect or use the
mock way that I just replied about.
Note, again, that solr-ruby
Hello,
I wanted to switch to a new version of Solr, specifically 3.5, but I'm seeing a
big drop in indexing speed.
I'm using 3.1, and after a few tests I discovered that 3.4 does it a lot better
than 3.5.
My schema is really simple, a few fields using a text-type field:
<fieldType name="text"
It's certainly stable enough to start experimenting with, and I know
that it's under pretty active development now. I've seen a lot
of back-and-forth between Mark Miller and Jamie Johnson,
Jamie trying things and Mark responding.
It's part of the trunk, so be prepared for occasional re-indexing
On Wed, Feb 22, 2012 at 9:27 AM, prasenjit mukherjee
prasen@gmail.com wrote:
Is anybody aware of any effort to port Solr to Netty (or
any other async-IO based framework)?
Even on medium load (10 parallel clients) with 16 shards
performance seems to
Hello,
Would it be unusual for an import of 160 million documents to take 18 hours?
Each document is less than 1kb and I have the DataImportHandler using the jdbc
driver to connect to SQL Server 2008. The full-import query calls a stored
procedure that contains only a select from my target
Import times will depend on:
- hardware (speed of disks, cpu, # of cpus, amount of memory, etc)
- Java configuration (heap size, etc)
- Lucene/Solr configuration (many ...)
- Index configuration - how many fields, indexed how; faceting, etc.
- OS configuration (this usually matters to a lesser degree;
I wanted to switch to a new version of Solr, specifically 3.5,
but I'm seeing
a big drop in indexing speed.
Could it be the autoCommit configuration in solrconfig.xml?
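For reference, a minimal autoCommit block in solrconfig.xml looks roughly like this; the thresholds are illustrative, not the poster's values:

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>   <!-- commit after this many buffered documents -->
    <maxTime>60000</maxTime>   <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>

Very aggressive settings here make every small batch pay commit and cache-warming costs, which shows up as slower indexing.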
We started observing strange failures from ReplicationHandler when we
commit on master, on a trunk version 4-5 days old.
It works sometimes and sometimes not; we didn't dig deeper yet.
Looks like the real culprit hides behind:
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is
Thanks for the response.
Yes, we have 16 shards/partitions, each on one of 16 different nodes, and a
separate master Solr receiving continuous parallel requests from 10
client threads running on a single separate machine.
Our observation was that the performance degraded non-linearly as the load
(no. of
Oh sure! As best as I can, anyway.
I have not set the Java heap size, or really configured it at all.
The server running both the SQL Server and Solr has:
* 2 Intel Xeon X5660 (each one is 2.8 GHz, 6 cores, 12 logical processors)
* 64 GB RAM
* One Solr instance (no shards)
I'm not using
Hi darul,
You're right, I was not using defaultSearchField. So, following your
suggestions, I added
<defaultSearchField>text</defaultSearchField>
and
<copyField source="section_text_content" dest="text"/>
This required that I add a field named "text", which is fine. I did that. Now,
when I commit the doc for
As I've mentioned before, I'm very new to Solr. I'm not a Java guy or an
Apache guy. I'm a .Net guy.
We have a rather large schema - some 100 + fields plus a large number of
dynamic fields.
We've been trying to improve performance and finally got around to
implementing fastvectorhighlighting
I'm not sure I understand your solution.
When (and how) will the 'word' detection in the full text happen: before (on
my own) or during Solr indexing (with Solr)?
I'm running into a problem with queries that contain forward slashes and more
than one field.
For example, these queries work fine:
fieldName:/a
fieldName:/*
But if I have two fields with similar syntax in the same query, it fails.
For simplicity, I'm using the same field twice:
fieldName:/a
I changed the heap size (Xmx1582m was as high as I could go). The import is at
about 5% now, and from that I now estimate about 13 hours. It's hard to say,
though; it keeps going up little by little.
If I get approval to use Solr for this project, I'll have them install a 64-bit
JVM instead,
Hello,
is there any way to check whether a field of a SolrDocument is a multiValued
field with Java (SolrJ)?
Greets
Thomas
Mr Gupta,
Thanks so much for your reply!
Retrieving data by keyword is one of my use cases, and I think Solr is
a proper choice for it.
However, Solr does not provide complex enough support for ranking. And
frequent updating is also not well suited to Solr. So it is difficult to
retrieve data
In my first try with the DIH, I had several sub-entities and it was making six
queries per document. My 20M doc load was going to take many hours, most of a
day. I rewrote it to eliminate those, and now it makes a single query for the
whole load and takes 70 minutes. These are small documents,
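As an illustration of the difference (the table and column names below are hypothetical, not the poster's), a data-config.xml that folds the extra data into the parent query via a join avoids the per-document sub-entity queries:

<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://dbhost;databaseName=mydb" user="solr" password="..."/>
  <document>
    <!-- one SQL statement for the whole load instead of one sub-entity query per document -->
    <entity name="doc"
            query="SELECT d.id, d.title, c.name AS category
                   FROM docs d LEFT JOIN categories c ON c.id = d.category_id">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <field column="category" name="category"/>
    </entity>
  </document>
</dataConfig>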
If you use the suggested solution, it will detect the words at indexing
time.
However, Solr's FilterFactory lifecycle keeps no track of whether a
file for synonyms, keywords, etc. has been changed since Solr's last startup.
Therefore a change within these files is not visible until you reload
Hi Thomas,
With Java (from within a custom handler in Solr) you can get a handle to the
IndexSchema from the request, like so:
IndexSchema schema = req.getSchema();
SchemaField sf = schema.getField(fieldname);
boolean isMultiValued = sf.multiValued();
From within SolrJ code, you can use
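One possibility from the client side (a sketch from memory, not a confirmed recipe; verify the accessor names against your SolrJ version) is the Luke request, whose per-field schema flags include multiValued. The field name below is hypothetical:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.LukeRequest;
import org.apache.solr.client.solrj.response.LukeResponse;

public class MultiValuedCheck {
  public static void main(String[] args) throws Exception {
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    LukeRequest luke = new LukeRequest();
    luke.setShowSchema(true);                 // ask the Luke handler for schema details
    LukeResponse rsp = luke.process(server);
    LukeResponse.FieldInfo info = rsp.getFieldInfo().get("myfield");  // hypothetical field
    // The schema flag string encodes indexed, tokenized, stored, multiValued and so on;
    // check it for the multiValued flag.
    System.out.println("schema flags for myfield: " + info.getSchema());
  }
}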
Btw.:
Solr has no downtime while reloading the core.
It loads the new core and while loading the new one it still serves
requests with the old one.
When the new one is ready (and warmed up) it finally replaces the old core.
Best,
Em
On 22.02.2012 17:56, Xavier wrote:
I'm not sure to
Yury,
are you sure your request has a proper url-encoding?
Kind regards,
Em
On 22.02.2012 18:25, Yury Kats wrote:
I'm running into a problem with queries that contain forward slashes and more
than one field.
For example, these queries work fine:
fieldName:/a
fieldName:/*
But if I
On 2/22/2012 12:25 PM, Yury Kats wrote:
I'm running into a problem with queries that contain forward slashes and more
than one field.
For example, these queries work fine:
fieldName:/a
fieldName:/*
But if I have two fields with similar syntax in the same query, it fails.
For
On 2/22/2012 1:05 PM, Em wrote:
Yury,
are you sure your request has a proper url-encoding?
Yes
Solr does not provide complex enough support for ranking.
I believe Solr has plenty of pluggability for writing your own custom ranking
approach. If you think you can't do your desired ranking with Solr, you're
probably wrong and need to ask for help from the Solr community.
retrieving data by
2012/2/22 Yury Kats yuryk...@yahoo.com:
On 2/22/2012 12:25 PM, Yury Kats wrote:
I'm running into a problem with queries that contain forward slashes and
more than one field.
For example, these queries work fine:
fieldName:/a
fieldName:/*
But if I have two fields with similar syntax in
Would it be unusual for an import of 160 million documents
to take 18 hours? Each document is less than 1kb and I
have the DataImportHandler using the jdbc driver to connect
to SQL Server 2008. The full-import query calls a stored
procedure that contains only a select from my target table.
That's strange.
Could you provide a sample dataset?
I'd like to try it out.
Kind regards,
Em
On 22.02.2012 19:17, Yury Kats wrote:
On 2/22/2012 1:05 PM, Em wrote:
Yury,
are you sure your request has a proper url-encoding?
Yes
On 2/22/2012 1:25 PM, Em wrote:
That's strange.
Could you provide a sample dataset?
Data set does not matter. The query fails to parse, long before it gets to the
data.
On 2/22/2012 1:24 PM, Yonik Seeley wrote:
This is a bit puzzling as the forward slash is not part of the query
language, is it?
Regex queries were added that use forward slashes:
https://issues.apache.org/jira/browse/LUCENE-2604
Oh, so / is a special character now? I don't think it is
Ahmet,
I do not. I commented autoCommit out.
Devon Baumgarten
-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Wednesday, February 22, 2012 12:25 PM
To: solr-user@lucene.apache.org
Subject: Re: Unusually long data import time?
Would it be unusual for an import
Hi,
I am suddenly getting a maxClauseCount error and don't know why. I am
using Solr 3.5.
Hi,
I am suddenly getting a maxClauseCount exception for no reason. I am
using Solr 3.5. I have only 206 documents in my index.
Any ideas? This is weird.
QUERY PARAMS: [hl, hl.snippets, hl.simple.pre, hl.simple.post, fl,
hl.mergeContiguous, hl.usePhraseHighlighter, hl.requireFieldMatch,
On 2/22/2012 1:24 PM, Yonik Seeley wrote:
Looks like escaping forward slashes makes the query work, eg
fieldName:\/a fieldName:\/a
This is a bit puzzling as the forward slash is not part of the query
language, is it?
Regex queries were added that use forward slashes:
Devon, you ought to try updating from many threads (I do not know if
DIH can do it; check), but Lucene does a great job if fed from many
update threads...
It depends on where your time gets lost, but it is usually a) the analysis chain
or b) the database.
If it is a) and your server has spare CPU cores, you
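A minimal multi-threaded feeding sketch with SolrJ (URL, queue size and thread count are illustrative, not values from this thread):

import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class MultiThreadedFeed {
  public static void main(String[] args) throws Exception {
    // Buffers up to 100 documents and drains them with 4 background writer threads,
    // so the server's analysis chain sees concurrent update requests.
    StreamingUpdateSolrServer server =
        new StreamingUpdateSolrServer("http://localhost:8983/solr", 100, 4);
    for (int i = 0; i < 10000; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-" + i);        // hypothetical uniqueKey
      doc.addField("text", "body " + i);     // hypothetical field
      server.add(doc);
    }
    server.commit();
  }
}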
out of curiosity, trying to see if new cloud features can replace what
I use now...
how is this (batch) update forwarding solved at cloud level?
imagine simple one shard and one replica case, if I fire up DIH
update, is this going to be replicated to replica shard?
If yes,
- is it going to be
Well, you probably need to clear your index first: remove the index directory,
restart your server and try again.
Let me know if it works or not.
And check your log file; you may have some errors at the start of your server,
due to some mistake, e.g. bad syntax in your schema file...
Hi,
I am getting the below error while running a delta import, and my index is not
updated. Could you please let me know what might be causing this issue? I am
using Solr 3.5, and around 60+ documents are supposed to be updated by the
delta import.
[org.apache.solr.handler.dataimport.SolrWriter] -
Hi Uomesh,
I was facing similar issues a few days ago and was able to resolve them by
deleting the lock file created in the index directory and restarting my
Solr server.
I have documented the same in one of the posts at
http://www.params.me/2011/12/solr-index-lock-issue.html
Hope it helps!
-param
As an update to this... I tried running a query against the
4.0.0.2010.12.10.08.54.56 version and the newer 4.0.0.2012.02.16 (both on
the same box). So the query params were the same, returned results were the
same, but the 4.0.0.2010.12.10.08.54.56 returned the results in about 1.6
seconds and the
I have it all commented out in updateHandler; I'm pretty sure there is no
default autoCommit:
<updateHandler class="solr.DirectUpdateHandler2">
iorixxx wrote:
I wanted to switch to a new version of Solr, specifically 3.5,
but I'm seeing
a big drop in indexing speed.
Could it be autoCommit
I am working on upgrading Solr from 1.4 to 3.5, and I have hit a problem. I
have a test checking for a search result in Solr, and the test passes in Solr
1.4, but fails in Solr 3.5. Dismax is the desired QueryParser -- I just
included output from lucene QueryParser to prove the document
Thank you everyone for your patience and suggestions.
It turns out I was doing something really unreasonable in my schema. I
mistakenly edited the max EdgeNgram size to 512, when I meant to set the
lengthFilter max to 512. I brought this to a more reasonable number, and my
estimated time to
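For reference, the two settings that got mixed up live on different filters in the analyzer chain; a rough sketch with illustrative values (not the poster's schema):

<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- keeps tokens up to 512 characters long; this is the value meant to be 512 -->
  <filter class="solr.LengthFilterFactory" min="1" max="512"/>
  <!-- emits prefix n-grams; a huge maxGramSize multiplies the number of terms per token -->
  <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15"/>
</analyzer>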
Thanks for your reply, but it doesn't work.
I get the same message: can't convert empty path
and also: cannot find class org.apache.nutch.crawl.injector
..
On 22 February 2012 at 06:14, tamanjit.bin...@yahoo.co.in
tamanjit.bin...@yahoo.co.in wrote:
Try this command.
bin/nutch crawl
Make sure that your schema file is exactly the same on both
your local server and the remote server. In particular, there should
be a dynamic field definition like:
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
and you should see a couple of fields appear like
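For context, a LatLonType setup in the Solr 3.x example schema looks roughly like this (the field name mirrors the earlier error report; treat the exact attributes as illustrative):

<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<field name="emploi_city_geoloc" type="location" indexed="true" stored="true"/>
<!-- LatLonType stores each lat/lon component in a dynamic sub-field ending in _coordinate -->
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>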
Hi,
I stumbled across this thread after running into the same question. The
answers presented here seem a little vague and I was hoping to renew the
discussion.
I am using a branch of Solr 4, distributed searching over 12 shards.
I want the documents in the first shard to always be
hello all,
I need to support the following:
if the user enters sprayer in the desc field, then they get results for
BOTH sprayer and washer,
and in the other direction:
if the user enters washer in the desc field, then they get results for
BOTH washer and sprayer.
Would I set up my synonym
Hi,
I am getting numerous errors preventing a build of solrcloud trunk.
[licenses] MISSING LICENSE for the following file:
Any tips to get a clean build working?
thanks
Hi dhaivat,
I think you may want to use analysis.jsp:
http://localhost:8983/solr/admin/analysis.jsp
Go to the URL and look into how your custom tokenizer produces tokens,
and compare with the output of Solr's inbuilt tokenizer.
koji
--
Query Log Visualizer for Apache Solr
http://soleami.com/
So I don't really know what I'm talking about, and I'm not really sure
if it's related or not, but your particular query:
The Beatles as musicians : Revolver through the Anthology
With the lone token that is a ':', it reminds me of a dismax stopwords-type
problem I ran into. Now, I ran into it on
I forgot to include the field definition information:
schema.xml:
<field name="all_search" type="text" indexed="true" stored="false" />
solr 3.5:
<fieldtype name="text" class="solr.TextField"
positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer>
<tokenizer
Two things:
1. What version of Solr are you using? qt=dismax isn't going to any request
handler, I don't think.
2. What do you get when you add debugQuery=on? Try that with
both results and perhaps that will shed some light. If not, can you
post the results?
Best
Erick
On Wed, Feb 22, 2012 at 7:47
Jonathan,
I have the same problem without the colon - I tested that, but didn't mention
it.
mm can't be the issue either: in Solr 3.5, if I remove one of the occurrences
of "the" (it doesn't matter which), I get results. Removing any other word does
NOT get results. And if the query isn't
Looks like an issue around replication IndexWriter reboot, soft commits and
hard commits.
I think I've got a workaround for it:
Index: solr/core/src/java/org/apache/solr/handler/SnapPuller.java
===
---
(12/02/22 7:53), Nitin Arora wrote:
Hi,
I'm using SOLR and Lucene in my application for search.
I'm facing an issue where highlighting using FastVectorHighlighter does not
work when I use PayloadTermQueries as clauses of a BooleanQuery.
After debugging I found that in DefaultSolrHighlighter.java,
The data-config.xml file that I have for indexing database contents has nested
entity nodes within a document node, and each of the entities contains field
nodes. Lucene indexes consist of documents that contain fields. What about
entities? If you change the way entities are structured in a
Could you point me to the most non-intimidating introduction to SolrJ that you
know of? I have a passing familiarity with Javascript and, with few exceptions,
I haven't developed software that has a graphical user interface of any kind
in about 25 years. I like the idea of having finer control
I know everyone is busy, but I was wondering if anyone had found
anything with this? Any suggestions on what I could be doing wrong
would be greatly appreciated.
On Fri, Feb 17, 2012 at 4:08 PM, Mark Miller markrmil...@gmail.com wrote:
On Feb 17, 2012, at 3:56 PM, Jamie Johnson wrote:
id
I am working on indexing the contents of a database that I don't have
permission to alter. In particular, the DataImportHandler examples that show
how to specify a deltaQuery attribute value show database tables that have a
last_modified column, and it compares these values with last_index_time
Yonik did fix an issue around peer sync and deletes a few days ago - long
chance that was involved?
Otherwise, neither Sami nor I have replicated these results so far.
On Feb 22, 2012, at 8:56 PM, Jamie Johnson wrote:
I know everyone is busy, but I was wondering if anyone had found
anything
Perhaps if you could give me the steps you're using to test I can find
an error in what I'm doing.
On Wed, Feb 22, 2012 at 9:24 PM, Mark Miller markrmil...@gmail.com wrote:
Yonik did fix an issue around peer sync and deletes a few days ago - long
chance that was involved?
Otherwise, neither
Jonathan has brought it to my attention that BOTH of my failing searches happen
to have 8 terms, and one of the terms is repeated:
The Beatles as musicians : Revolver through the Anthology
Color-blindness [print/digital]; its dangers and its detection
but this is a PHRASE search.
In case
It *just happens* that I wrote a blog on this very topic, see:
http://www.lucidimagination.com/blog/2012/02/14/indexing-with-solrj/
That code contains two rather different methods, one that indexes
based on a SQL database and one based on indexing random files
with client-side Tika.
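For a taste of the SolrJ side (a generic sketch, not the code from that blog post; the URL and field names are placeholders):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SimpleIndexer {
  public static void main(String[] args) throws Exception {
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-1");               // must match the schema's uniqueKey
    doc.addField("title", "Hello from SolrJ"); // hypothetical field
    server.add(doc);
    server.commit();                           // make the document searchable
  }
}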
Best
Erick
On Wed, Feb 22, 2012 at 7:35 PM, Naomi Dushay ndus...@stanford.edu wrote:
Jonathan has brought it to my attention that BOTH of my failing searches
happen to have 8 terms, and one of the terms is repeated:
The Beatles as musicians : Revolver through the Anthology
Color-blindness
I have a dismax request handler with a default fq parameter.
<requestHandler name="dismax" class="solr.DisMaxRequestHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<float name="tie">0.01</float>
<str name="qf">
sku^9.0 upc^9.1 searchKeyword^1.9 series^2.8 productTitle^1.2 productID^9.0
Same question here...
On Wednesday, February 22, 2012, geeky2 gee...@hotmail.com wrote:
hello all,
i need to support the following:
if the user enters sprayer in the desc field - then they get results for
BOTH sprayer and washer.
and in the other direction
if the user enters washer in
Think I answered my own question... I need to use an appends list
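For the record, an appends section that keeps a filter query from being overridden by request parameters looks roughly like this (the fq value is a hypothetical example):

<requestHandler name="dismax" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
  <!-- appends params are added to every request and are not replaced by client params -->
  <lst name="appends">
    <str name="fq">inStock:true</str>
  </lst>
</requestHandler>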
Hi, François Schiettecatte
Thank you for the reply all the same, but I choose to stick with Solr
(wrapped with the Tika language API) and make changes outside Solr.
Best Regards,
Bing
Hi all!
I'm using Tika parser to index my files into Solr. I created my own parser
(which extends XMLParser). It uses my own mimetype.
I created a jar file which inside looks like this:
src
|-main
|-some_packages
|-MyParser.java
|-resources
|-META-INF
|-services
Use
sprayer, washer
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
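Wired into a field type, that entry would look roughly like this (a sketch, not the poster's actual schema); with expand="true", a query for either term matches both:

<!-- synonyms.txt contains the line: sprayer, washer -->
<fieldType name="text_syn" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>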
Regards
Bernd
On 23.02.2012 07:03, remi tassing wrote:
Same question here...
On Wednesday, February 22, 2012, geeky2gee...@hotmail.com wrote:
hello all,
i need to support the
thanks Mark, I will give it a go and report back...
On Thu, Feb 23, 2012 at 1:31 AM, Mark Miller markrmil...@gmail.com wrote:
Looks like an issue around replication IndexWriter reboot, soft commits and
hard commits.
I think I've got a workaround for it:
Index:
Hi, Erick,
The example is impressive. Thank you.
For the first, we decided not to do that, as Tika extraction is the
time-consuming part of indexing large files, and the dual call makes the
situation worse.
For the second, for now, we choose DSpace to connect to the DB, and
Discovery (Solr) as the
Hello Mike,
Solr is still too flat. Work is in progress:
https://issues.apache.org/jira/browse/SOLR-3076
A good introduction is in Michael's blog
http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html
but it's only about Lucene queries.
A colleague of mine blogged about the same