Hi,
I have been using external file field (eff) for holding rank of the document
which gets updated every day based on different stats collected by the
system. Once the rank is computed the new files are pushed to Master which
will eventually replicate to slaves on next commit.
Our eff file has
+1
:D
--
View this message in context:
http://lucene.472066.n3.nabble.com/Date-for-4-4-solr-release-tp4079152p4079254.html
Sent from the Solr - User mailing list archive at Nabble.com.
Walter,
Could you provide some more details about your staggered replication
approach?
We are currently running into similar issues, and it looks like staggered
replication is a better approach to address the performance issues on
Slaves.
thanks
Aditya
Thanks Shawn,
We do have repeaters setup to replicate index to the 8 Slaves.
We update documents on Master every 2hrs in a batch process. On hard
commit the index is replicated to the repeaters and then to the slaves.
The concern is that during heavy traffic, when slaves are busy serving
requests, when a new
I have seen this in 4.2.1 too.
Once replication is finished, the Admin UI shows 100% and the time and dlspeed
information goes out of whack. The same is reflected in mbeans. But what's actually
happening in the background is auto-warmup of caches (in my case).
Maybe a minor stats bug.
Hi,
Is staggered replication possible in Solr through configuration?
We are concerned about the CPU spike (80%) and GC pauses on all the slaves when
they try to replicate the updated index from the repeaters. We haven't observed this
behavior in v3.5 (max spikes were 50% during replication).
In our case we
For reference
https://issues.apache.org/jira/browse/SOLR-5019
--
View this message in context:
http://lucene.472066.n3.nabble.com/Concurrent-Modification-Exception-tp4074371p4076330.html
Nope no custom plugin. Just use the DirectSpellCheck component.
We have raised a ticket with LucidWorks. I will follow up on that, and once
we have a JIRA I will update this post.
Well, our observations suggest that this happens only during spell check.
If we turn off the spell check, we don't see this issue occurring at all in
our 24hr test run.
We have Jboss5.1 in production running Solr 4.2.1 (without spellcheck) no
issues at all.
Aditya
Anyone, any suggestions or pointers for this issue?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Concurrent-Modification-Exception-tp4074371p4074829.html
Hi,
I have recently upgraded from Solr 3.5 to 4.2.1.
We have also added the spellcheck feature to our search query. During our
performance testing we observed that for every 2000 requests, 1 request
fails.
The exception we observe in the Solr log is ConcurrentModificationException.
Below is the
I see that your query has a boost value, which means you need Solr to score
each matching document.
One of the key differences between q and fq is that fq will not have any
impact on the score, whereas a clause in q will score each document based on
the Similarity score.
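To make the distinction concrete, here is a minimal sketch of how a client might split a request into a scored q clause and a non-scoring fq clause. The field names (title, category), the boost value and the helper name build_params are assumptions for illustration only; they are not taken from the original query.

```python
from urllib.parse import urlencode


def build_params(keyword, category):
    """Build Solr request parameters (hypothetical field names).

    The boosted keyword clause goes into q, so it contributes to the score.
    The category restriction goes into fq, so it only narrows the result set.
    """
    return [
        ("q", f"title:{keyword}^2"),     # scored; the boost applies here
        ("fq", f"category:{category}"),  # filter only; no effect on score
    ]


# URL-encode the parameters so they can be appended to a /select request.
query_string = urlencode(build_params("solr", "books"))
print(query_string)
```

Moving a restriction from q to fq should not change which documents match, only whether that clause takes part in scoring.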
+1
q and fq both can be cached.
--
View this message in context:
http://lucene.472066.n3.nabble.com/fq-vs-q-parameter-tp4071748p4071759.html
It was interesting to read this post. I had a similar issue on Solr v4.2.1. The
nature of our documents is that they have huge multiValued fields, and we were
able to knock out our server in about 30 mins.
We then found a bug, LUCENE-4995, which was causing all the problems.
Applying the patch has helped a
Thanks for the explanation Steve. I now see it clearly. In my case it should
work.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Best-way-to-match-umlauts-tp4070256p4070805.html
Just to confirm, even solr.ASCIIFoldingFilterFactory should serve the
purpose.
Am I correct?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Best-way-to-match-umlauts-tp4070256p4070317.html
This might be a dumb question, but can you please point me to some key
differences between the ASCIIFolding filter and a character filter using a map
file?
thanks
Aditya
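
As a rough illustration of the two approaches being compared, the sketch below contrasts accent folding with an explicit character mapping. It is plain Python, not Solr code: fold_accents only approximates what an ASCII folding token filter does, and UMLAUT_MAP is a hypothetical mapping of the kind one might put in a character filter's mapping file.

```python
import unicodedata


def fold_accents(text):
    """Approximate ASCII folding: decompose characters, drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical mapping, similar in spirit to a character-filter mapping file.
UMLAUT_MAP = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def map_chars(text, mapping=UMLAUT_MAP):
    """Replace characters using an explicit, user-defined mapping."""
    return "".join(mapping.get(ch, ch) for ch in text)


folded = fold_accents("müller")  # fixed folding: the umlaut is simply dropped
mapped = map_chars("müller")     # custom mapping: the umlaut becomes "ue"
print(folded, mapped)
```

The practical difference is control: folding applies a fixed set of replacements to tokens, while a mapping file lets you choose each replacement and is applied to the character stream before tokenization.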
--
View this message in context:
http://lucene.472066.n3.nabble.com/Best-way-to-match-umlauts-tp4070256p4070398.html
Hi All,
I work with Sandeep M, so I am continuing from his comments. We did observe
memory growth.
We use jdk1.6.0_45 with CMS. We see this issue because of large document
size. By large I mean our single document has large multivalued fields.
We found that JIRA LUCENE-4995
Did you try reducing the filter and query caches? They are fairly large too, unless
you really need them to be cached for your use case.
Do you have that many distinct filter queries hitting Solr for the size you
have defined for filterCache?
Are you doing any sorting? This will chew up a lot of
Hi,
We recently had a production release to upgrade our Solr 3.5 to Solr 4.2.1. (No
schema changes except some basic ones required for 4.2.1.)
The nature of our documents is that we have huge multivalued fields. They can
go from 1000 to 100K in one single field.
# Documents : 300K
# Index size: 9GB
Thanks for your reply, Chris.
Yes, I am aware of this bug. We had reported this through LucidWorks during
our 4.2.0 evaluation :)
I will try to get a thread dump and verify where the CPU is pegging.
Regarding tall documents: we have a huge list of multivalued fields in the
document, which I refer to as tall
These numbers are really great. Would you mind sharing your h/w configuration
and JVM params?
thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/Upgrading-from-SOLR-3-5-to-4-2-1-Results-tp4064266p4064370.html
Hi,
We recently decided to move from Solr version 3.5 to 4.2.1. The transition
seems to be smooth from a development point of view, but I see some intermediate issues
with our cluster.
Some information: we use the classic Master/Slave model (and have plans to move
to Cloud v4.3).
#documents 300K and have around
Based on the fq you posted (with the - in it), are you trying to filter out all the
offline users?
Another option: do you need the complete list in one request? Did you try
splitting the ids into batches of, say, 100 ids per Solr query?
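The batching idea could be sketched as follows. This is only an illustration: the field name userId, the 250 made-up ids and the helper names are assumptions, and a positive filter is used here for simplicity rather than the negated fq from the thread.

```python
def batches(ids, size=100):
    """Yield consecutive slices of ids, each holding at most `size` items."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_filter(id_batch, field="userId"):
    """Build one fq clause for a batch; `userId` is a hypothetical field name."""
    return f"{field}:(" + " OR ".join(str(i) for i in id_batch) + ")"


user_ids = list(range(1, 251))  # 250 made-up ids for the example
filters = [build_filter(batch) for batch in batches(user_ids)]

# One Solr request would then be issued per entry in `filters`.
print(len(filters))
```

Each request stays small, and the application merges the per-batch results afterwards.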
Hi Ali,
We have Solr 4.2 on JBoss running on a separate VM behind a firewall. Only IT
Administration and our FrontEnd Application Server are able to access the
Solr servers in production.
But I see two different version numbers.
Under commit:
<long name="indexVersion">1364655616211</long>
<long name="generation">22</long>
and under Master/Slave tags:
<long name="indexVersion">1364619609805</long>
<long name="generation">20</long>
Can this ever be the case on Master?
Mark, isn't this response from master somewhat confusing? The gen and
version numbers are out of sync.
http://.../solr/replication?command=details
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">2</int>
</lst>
<lst name="details">
<str name="indexSize">1.52 GB</str>
<str
Hi Joel,
I might have an answer for this. Initially my servers were on 3.5 and then I
moved to Solr 4.0. At that time I used the solrconfig.xml that was in the
example and updated it with the parameters I had changed in 3.5 for the environment.
There was no <codecFactory class="solr.SchemaCodecFactory"/> in the
+1
I have observed this same issue: no change on master, and the slave is bumped up
with a higher index number.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-4-2-Slave-Index-version-is-higher-than-Master-tp4049827p4052445.html
Something is really wrong with replication.
Check the attached document, which has the screenshot.
I re-indexed the master after adding new fields to the schema file (it's part
of config file replication).
The UI shows master as gen '6', whereas in the slave log the Master gen is '7'.
The attached
Joel, thanks for your excellent idea of using docValues. It's working exactly as
you described.
So far my unit test cases have no issues and I see a low memory footprint. I will
be sending the build for performance testing, which should give comparable numbers.
Now I see another replication issue in 4.2. there is
@Mark attached are the full logs from both master and slave. Hope this might
be some help.
console_master.log
http://lucene.472066.n3.nabble.com/file/n4052516/console_master.log
console_slave.log
http://lucene.472066.n3.nabble.com/file/n4052516/console_slave.log
Ignore the mbeans call in
Here is the field type definition. same as what you posted yesterday just a
different name.
<fieldType name="dvLong" class="solr.TrieLongField" precisionStep="0"
docValuesFormat="Disk" positionIncrementGap="0"/>
And Field Definition
<field name="lcontNumOfDownloads" type="dvLong" indexed="true" stored="true"
still no luck
Performed:
1. Stopped the application server (JBoss)
2. Deleted everything under data
3. Started the server
4. Observed the exception in the log (I have uploaded the file)
On a side note, do I need to have any additional jar files in the Solr home
lib folder? Currently it's empty.
Update ---
I was able to fix the exception by adding the following line to solrconfig.xml:
<codecFactory name="CodecFactory" class="solr.SchemaCodecFactory" />
I am not sure if any document mentions that this has to be declared in the config
file.
I am now re-indexing the data on the master and will perform test
Whoa, that's strange.
I tried toggling the codec factory line in solrconfig.xml (attached in
this post).
Commenting it out gives me the error, whereas un-commenting it works.
Can you please take a look at the config and let me know if anything is wrong
there?
thanks
Aditya
solrconfig.xml
Hi Joel,
you are correct, boost function populates the field cache. Well i am not
aware of docValue, so while trying the example you provided i see the error
when i define the field type
Caused by: org.apache.solr.common.SolrException: FieldType 'dvLong' is
configured with a docValues format,
thanks Eric. in this query q=*:* the Lucene score is always 1
--
View this message in context:
http://lucene.472066.n3.nabble.com/Too-many-fields-to-Sort-in-Solr-tp4049139p4050944.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
There were a couple of major issues with replication in 4.1. We experienced
the same and were advised to move to 4.2, which has a fix for this. I suggest
you upgrade to 4.2.
Hi All,
I want to validate my approach with the experts, just to make sure I am not
doing anything wrong.
#Docs in Solr : 25M
Solr Version : 4.2
Our requirement is to list the top downloaded documents based on user country.
So we have a dynamic field *numdownload.** which is evaluated as
Hi,
As per one of our search requirements for searching on title, we have
implemented it as below, which serves us quite well.
Title : iTunes Sync
Analyzer on this field is
WhitespaceTokenizerFactory
WordDelimiterFilterFactory {generateNumberParts=1, catenateWords=1,
generateWordParts=1,
thanks Eric,
Well, the index was built from scratch with 4.1. Our IT engineer was able to
take some CPU samples, and on analyzing them he mentioned that:
While running I noticed that the method:
java.util.AbstractList$Itr.hasNext() (called by
org.apache.lucene.document.Document.getFields() ) taking a
Thanks guys ..
Well i did another test. Copied the Index files from perf lab to Dev machine
which has Solr4.1
Now ran solrmeter to generate load on the Dev server. We were able to drive the
QPS up to 150 with CPU at 35% on average, but the same index is generating 100% CPU
at 1 QPS in the perf lab.
On a side
What's the life cycle of a tlog file? Is it purged after a commit (even with
soft commit)?
I posted 100 docs to Solr (standalone) and did a hard commit. Observed that a new tlog
file is created.
Re-posted the same 100 docs and did a hard commit. Observed that a new tlog file is
created. The old one still exists.
When
thanks Shawn,
I did try specifying a fixed set of fl, and with no score; neither gave any
better performance.
We have a different VM with the same index and Solr 4.1 on JBoss 5.1 which does
perfectly fine with all the queries. So this is confusing us a bit more.
Have our VM expert to look now
The Logging screen seems to be broken on Solr 4.1 .. any ideas?
http://lucene.472066.n3.nabble.com/file/n4043787/logging_level_4.1.jpg
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr4-1-Loggin-Level-Screen-just-shows-root-tp4042556p4043787.html
@Hung,
Did you rename the index directory on master or on Slave?
I tried this but it didn't work for me. :( With every commit the slave creates
a new index.x directory and downloads the complete index from master.
My setup is JBoss7.1.1 with Solr 4.1
thanks
Aditya
Hi,
A little history before I describe the actual issue. Please bear with me.
In the Dev lab with a single VM (2vCPU, 4GB RAM, 3GB to JVM, JBoss5.1,
JDK1.6.0.30) we used Solr 4.1 and indexed 250K documents, with each document
of avg size 22KB. Total index size is 3.6GB.
Everything works well. The only
Hi, we have just upgraded our Dev lab to Solr 4.1 with no Cloud, so our
implementation is like
Master - Repeater - Slaves(2). In production we have a large cluster, so there
will be 8 slaves.
What we observed is:
1. Slave 1, replicating the index from master, shows the correct version number
2. Slave 2
Thanks for the update, Mark. Looking forward to this fix. Will try the trunk 4.x.
Is it safe to assume that even with the incorrect version numbers the data
on the slave (replicated from Repeater) is the same as what's on master? At least
the files I see in the index dir are the same.
As per our schedule we
Hi,
Have been using Solr 3.5 for a while and now upgraded to Solr4.1
Everything works as expected except the LoggingLevel Screen. All i see is
just the root, no other packages are displayed.
I am running Solr 4.1 on Jboss 7.1 and there are no additional jars in the
SOLR_HOME/collections1/lib.
thanks Erick,
looks like i need to generate some part of the query in Application layer to
handle this.
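A minimal sketch of what generating that part of the query in the application layer might look like is shown below. It assumes per-language dynamic fields named title.<lang>, as in the thread; the helper name edismax_params and the choice to also search a default-language field as a fallback are illustrative assumptions, not something confirmed in the discussion.

```python
from urllib.parse import urlencode


def edismax_params(keyword, user_lang, default_lang="0"):
    """Build edismax parameters for a per-language title field.

    Assumes dynamic fields named title.<lang>. Searching the default
    language as a fallback is an illustrative choice.
    """
    fields = [f"title.{user_lang}"]
    if user_lang != default_lang:
        fields.append(f"title.{default_lang}")  # fallback field
    return {
        "q": keyword,
        "qf": " ".join(fields),
        "defType": "edismax",
    }


params = edismax_params("facebook", "3")
url_query = urlencode(params)  # ready to append to a /select request
print(url_query)
```

The user still passes only the keyword; the language-specific field list is filled in by the application before the request reaches Solr.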
--
View this message in context:
http://lucene.472066.n3.nabble.com/Dynamic-field-searching-with-edismax-tp4041461p4041551.html
We have a search query which we want to configure in solrconfig.xml and lock
down in such a way that the user will just pass the keyword, e.g.
q=facebook&qf=title.0&defType=edismax
We have a dynamic field title.* for each language. During search the * is
replaced with the logged-in user's language and hence need
thanks Eric,
is this what you are pointing me to ?
http://.../solr/select?q=if(exist(title.3),(title.3:xyz),(title.0:xyz))
I believe i should be able to use boost along with proximity too.
Hi,
Trying to find a better approach for searching keywords.
We have about 100K documents indexed in Solr 3.5, and each doc has
a title field for different countries.
The title field is dynamic, defined as title.*, for about 20 countries.
It's not necessary that each document will have a title for all 20
Hi,
We currently have Solr 3.5 and it is working well. With the features and fixes
available in 4.1 we decided to upgrade.
We started some tests with Solr 4.1 on JBoss 7.1. Everything looked good on the first
run, indexing and executing some queries. We restart servers before
performing Load test and
thanks Jack,
I did take the latest solrconfig.xml file.
The only change I made to the file is for using MMapDirectory:
<directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.MMapDirectoryFactory}"/>
Apart from that I increased the cache size for
Thanks for additional information Shawn,
I am testing 4.1 on a single machine, single core, so no cloud. I did change
NRTCachingDirectoryFactory to MMapDirectoryFactory, and after indexing all
the documents we do a hard commit explicitly from our publisher client.
I was able to run queries to verify my
Shawn,
I believe your point is valid ... as you can see below, my tlog.* file size is
huge. But shouldn't that be cleared if I am not using soft commit and do an
explicit hard commit?
After deleting this I was able to get my server up. Thanks for the
information/help. Also please let me know how to
thanks Shawn,
We use Master-Slave architecture in Prod and planning to continue even with
4.1.
Our indexing usually happens on Master, and we index about 10K docs every 2hrs and
then perform a commit.
Our full re-index happens only when we have a schema change. So we don't use auto
commit.
Is there a way to
thanks Shawn,
I will try both the approach.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Server-stops-at-Opening-Searcher-in-4-1-tp4037458p4037499.html
Thanks Paul,
In our case we need to upgrade DB with new version before re-building the
index as we pull data from DB.
Once DB is upgraded, UI needs to be upgraded too.
We do the same in cases where there are no DB schema changes. We disable
replication and build the index on Master. Once index
Our main issue is for new index we need to update DB first as data needs to
be pulled from DB as per new schema and publish to Solr.
So DB update has to be first step.
Hi Erick,
I have upgraded from 3.5 to 4.0. On the first index build I was able to see the
distinct terms in the UI under the Schema Browser section. I had to add a new date
field as stored to the schema and re-index.
After that, for every field the UI always shows the distinct term count as
-1.
Can you please
Hi
Deployed 4.0, and while using the Schema Browser to see the
unique term count for each field, observed the following error. The top term
shows 10/-1. It's -1 all the time. Any idea what might be wrong?
thanks
Aditya
2012-10-28 10:48:42,017 SEVERE
Hi Vijith,
See if this solution solves your problem. There might be other ways; this is
the one I have on top of my mind at this hour.
You might have an ID for each qualification. Then express the relation
using dotted notation.
1 = MBA, 2 = LEAD etc.
<arr name="grade">
<str>1.A</str>
<str>2.B</str>
Are you applying any analyzer/tokenizer for the fieldType 'string'? (I guess
not.)
Your query in the response shows '*dell*' whereas your stored data is
'*Dell*'.
If you want to search ignoring case, then you might need to use
LowerCaseFilterFactory in the field's analyzer and then perform
Can you please share some information on setting up Solr 4.0 as a single core?
I tried doing it and keep seeing a ClassNotFound exception for
KeywordTokenizerFactory on server start up.
I see the jar files being loaded in the logs, but it's unable to find the
class.
Can you let me know what jars