Maybe if I were to say that the column user_id will become user_ids,
that would clarify things?
user_id:2002+AND+created:[${from}+TO+${until}]+data:more
becomes
user_ids:2002+AND+created:[${from}+TO+${until}]+data:more
where I want 2002 to be an exact positive match on one of the user_ids.
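For what it's worth, a sketch of how user_ids might be declared in schema.xml so that a query term must exactly equal one of the stored values (the "string" type name is an assumption about your schema):

```xml
<!-- sketch: multivalued, untokenized field; an exact term query like
     user_ids:2002 matches a doc whose values include 2002 (but not 12002) -->
<field name="user_ids" type="string" indexed="true" stored="true"
       multiValued="true"/>
```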
Hi Shawn,
I also had CMS with tons of tuning options but still had an occasional
bigger GC pause. After switching to JDK 7 I tried G1GC with no other options
and it runs perfectly.
With CMS I saw that the old and young generations were growing until they
had to do a GC. This produces the sawtooth and
On Fri, 2013-06-07 at 07:15 +0200, Andy wrote:
One question I have: did you precondition the SSD (
http://www.sandforce.com/userfiles/file/downloads/FMS2009_F2A_Smith.pdf )?
SSD performance tends to take a very deep dive once all blocks are written at
least once and the garbage collector
I had a similar error. I couldn't find any documentation on which Nutch and
Solr versions are compatible. For instance, we're using Nutch 1.6 on
Hadoop 1.0.4 with SolrJ 3.4.0 and index crawled segments to Solr 4.2.0. But
I remember that I could find a compatible version of SolrJ for Nutch 1.4
Hi
I'm trying to compare the performance of different Solr queries. In order
to get a fair test, I want to clear the cache between queries.
How is this done? Of course, one can restart the server, but I wanted to know
if there is a quicker way.
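One common sketch for this kind of benchmarking is to shrink the caches in solrconfig.xml to nothing (whether size=0 fully disables a given cache can vary by version; reloading the core after editing the config also drops existing entries):

```xml
<filterCache      class="solr.FastLRUCache" size="0" initialSize="0" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache"     size="0" initialSize="0" autowarmCount="0"/>
<documentCache    class="solr.LRUCache"     size="0" initialSize="0" autowarmCount="0"/>
```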
Hi,
I need more information on how NoOpDistributingUpdateProcessorFactory works.
Below is the cloud setup:
collection1
  shard1 --- node1:8983 (leader)
             node2:8984
  shard2 --- node3:7585
Hi,
we were able to accomplish this with a single collection.
Zookeeper:
create a separate node for each shard, and upload the dbconfig file under
each shard's node,
e.g.: /config/config1/shard1
      /config/config1/shard2
      /config/config1/shard3
In the solrconfig.xml:
<requestHandler name="/dataimport"
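A sketch of how that request handler declaration might continue (the data-config file name is an assumption; each shard's copy would point at its own dbconfig uploaded to ZooKeeper):

```xml
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
```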
On Fri, 2013-06-07 at 09:24 +0200, Varsha Rani wrote:
I'm trying to compare the performance of different Solr queries. In order
to get a fair test, I want to clear the cache between queries.
How is this done? Of course, one can restart the server, but I wanted to know
if there is a quicker way.
A use case would be a web site or service that had millions of users, each of
whom would have an active Solr core when they are active, but inactive
otherwise. Of course those cores would not all reside on one node, and
ZooKeeper is out of the question for managing anything that is in the
I have a SolrCloud and I want to maintain some important things on it, i.e.
I will back up indexes, start and stop Solr nodes individually, send an
optimize request to the cloud, etc. However, I see that there is a scripts
folder that comes with Solr. Can I use some of them for my purposes, or should I
Hi,
I have two shards; logically, each shard corresponds to a region. Currently
the index is distributed across the shards in SolrCloud. How can I load the
index into a specific shard in SolrCloud?
Any thoughts?
Thanks,
Sathish
Guys,
Please clarify the following questions regarding Solr internationalization.
1) Initially my requirement is to support 2 languages (English, French)
for a web application.
We are using a MySQL DB.
2) So please share a good and easy approach to achieve it, with some sample
configs.
3)
The Wiki page was not built for Cloud Solr.
We have done such a deployment where less than a tenth of the cores were active
at any given point in time. Though there were tens of millions of indices, they
were split among a large number of hosts.
If you don't insist on a Cloud deployment, it is possible. I'm
Have you tried explicitly giving the field names (fl) as a parameter?
http://wiki.apache.org/solr/CommonQueryParameters#fl
On Thu, Jun 6, 2013 at 12:41 PM, anurag.jain anurag.k...@gmail.com wrote:
I want the output of the CSV file in a proper order. When I use wt=csv it gives
the output in random order. Is
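For example (hypothetical host and field names), listing the fields explicitly in fl also fixes the column order of the CSV output:

```
http://localhost:8983/solr/select?q=*:*&wt=csv&fl=id,name,price
```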
Thanks for this, hard data is always welcome!
Another blog post for my reference list!
Erick
On Fri, Jun 7, 2013 at 2:59 AM, Toke Eskildsen t...@statsbiblioteket.dk wrote:
On Fri, 2013-06-07 at 07:15 +0200, Andy wrote:
One question I have is did you precondition the SSD (
I don't think you want the noop bits, I'd go back to the
standard definitions here.
What you _do_ want, I think, is the custom hashing option, see:
https://issues.apache.org/jira/browse/SOLR-2592
which has been in place since Solr 4.1. It allows you to
send documents to the shard of your choice,
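As a sketch of what the composite-id convention from SOLR-2592 looks like (the "userA" shard key is hypothetical): documents whose ids share the prefix before "!" hash to the same shard:

```xml
<add>
  <doc>
    <!-- "userA!" is the shard key; all ids with this prefix co-locate -->
    <field name="id">userA!doc1</field>
  </doc>
</add>
```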
I really question whether this is valuable. Much of Solr performance
is there explicitly because of caches, so what you're measuring
is disk I/O to fill caches and any other latency. I'm just not sure
what operational information you'll get here.
But assuming that you're really getting actionable
Hello All,
I require facet counts for multiple search terms.
Currently I am doing two separate facet queries, one per search term, with
facet.range=dateField,
e.g.
http://solrserver/select?q=1stsearchTerm&facet=on&facet-parameters
I should have been clearer, and others have mentioned... the lots of cores
stuff is really outside Zookeeper/SolrCloud at present. I don't think it's
incompatible, but it wasn't part of the design so it'll need some effort to
make it play nice with SolrCloud. I'm not sure there's actually a
Good morning,
I would like to know how I can modify an XML file so that it accesses my
information and not the example information, because I have one file from
which I obtain the information that I show to the user with Blacklight.
Sorry about my english,
Alex
hi,
you need to parse your custom XML file and transform it into the XML format
that Solr understands. If you are familiar with XSLT, you
could do that in a few lines, depending on the complexity of the input XML
file.
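If it helps, a minimal XSLT sketch of that transformation, assuming a hypothetical input shaped like <records><record id="..."><title>...</title></record></records>:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- sketch: the input element names are assumptions about your custom XML -->
  <xsl:template match="/records">
    <add>
      <xsl:for-each select="record">
        <doc>
          <field name="id"><xsl:value-of select="@id"/></field>
          <field name="title"><xsl:value-of select="title"/></field>
        </doc>
      </xsl:for-each>
    </add>
  </xsl:template>
</xsl:stylesheet>
```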
Dmitry
On Fri, Jun 7, 2013 at 3:34 PM, acas...@greendata.com
Hi,
How did you distribute the index by year to different shards?
Do we need to write any code?
Thanks,
Sathish
--
View this message in context:
http://lucene.472066.n3.nabble.com/Doubt-Regarding-Shards-Index-tp3629964p4068869.html
Sent from the Solr - User mailing list archive at Nabble.com.
CROSS-POSTING from dev list.
Hi guys,
As discussed with Grant and Andrzej, I have created two JIRA issues related to
inefficiency in distributed faceting. This affects 3.4, but my gut feeling
is that 4.x is affected as well.
Regards,
Dmitry Kan
P.S. Asking this question won yours truly second
Eagle eye man.
Yeah, we plan on contributing hdfs support for Solr. I'm flying home today and
will create a JIRA issue for it shortly after I get there.
- Mark
On Jun 6, 2013, at 6:16 PM, Jamie Johnson jej2...@gmail.com wrote:
I've seen reference to an HdfsDirectoryFactory in the new
Hi,
Sharding by time does not by itself need any custom code on the Solr side:
index your data to a shard chosen by the timestamp of your
document.
The querying part is trickier if you want to have one front-end Solr: it
should know which shards to query. If querying all shards for each
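A sketch of the front-end query, with hypothetical hosts and per-year cores; only the shards whose time range can contain matches need to be listed:

```
http://frontend:8983/solr/select?q=...&shards=host1:8983/solr/2012,host2:8983/solr/2013
```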
This is exactly what we did for a client (alas, using Elasticsearch). We
then observed better performance through SPM. We used the latest Oracle JVM.
Otis
Solr ElasticSearch Support
http://sematext.com/
On Jun 7, 2013 2:55 AM, Bernd Fehling bernd.fehl...@uni-bielefeld.de
wrote:
Hi Shawn,
I
On Fri, Jun 7, 2013 at 7:32 AM, Erick Erickson erickerick...@gmail.com wrote:
I really question whether this is valuable. Much of Solr performance
is there explicitly because of caches
Right, and it's also the case that certain Solr features are coded
with the cache in mind (i.e. they will be
AFAICT, SolrCloud addresses the use case of distributed update for a
relatively smaller number of collections (dozens?) that have a relatively
larger number of rows - billions over a modest to moderate number of nodes
(a handful to a dozen or dozens). So, maybe dozens of collections (some
Right, a search for 442 would not match 1442.
-- Jack Krupansky
-Original Message-
From: z z
Sent: Friday, June 07, 2013 2:18 AM
To: solr-user@lucene.apache.org
Subject: Re: Schema Change: Int -> String (I am the original poster, new
email address)
Maybe if I were to say that the
Yes, it SHOULD! And in the LucidWorks Search query parser it does. Why
doesn't it in Solr? Ask Yonik to explain that!
-- Jack Krupansky
-Original Message-
From: Rahul R
Sent: Friday, June 07, 2013 1:21 AM
To: solr-user@lucene.apache.org
Subject: Re: OR query with null value and
Hi All,
I work with Sandeep M, so I am continuing from his comments. We did observe
memory growth.
We use JDK 1.6.0_45 with CMS. We see this issue because of large document
size. By large I mean our single document has large multivalued fields.
We found that JIRA LUCENE-4995
If you are trying to import an external XML file into your system, you
may want to look at DataImportHandler. It is a good way to start. Look
at Wikipedia examples.
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is
It may be helpful to approach this from the other side. Specifically search.
Are you:
1) Expecting to search across both French and English content (e.g.
French, but fallback to English if translation is missing)? If yes,
you want a single collection
2) Is French content completely separate from
Hi
I am having an issue with adding PDF documents to a SolrCloud index I have
set up.
I can index PDF documents fine using 4.3.0 on my local box, but I have a
SolrCloud instance set up on the Amazon Cloud (using 2 servers) and I get an
error.
It seems that it is not loading
Hi,
Can someone please tell me if there is a way to have custom clustering
of the data from Solr query results? I am facing 2 issues currently:
1. The Carrot clustering only applies clustering to the paged
results (i.e. the current pagination page's results).
2. I need to
This may help:
http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud
--- See Document Routing section.
-Original Message-
From: sathish_ix [mailto:skandhasw...@inautix.co.in]
Sent: Friday, June 07, 2013 5:27 AM
To: solr-user@lucene.apache.org
Subject: How to
: I don't think you want the noop bits, I'd go back to the
: standard definitions here.
Correct.
the NoOpDistributingUpdateProcessorFactory is for telling the update
processor chain that you do not want it to do any distribution of updates
at all -- whatever SolrCore you send the doc to, is
Hi Mark,
This is a total shot in the dark, but does
passing -Djava.awt.headless=true when you run the server help at all?
More on awt headless mode:
http://www.oracle.com/technetwork/articles/javase/headless-136834.html
Michael Della Bitta
Applications Developer
o: +1 646 532 3062 | c: +1
Thank you for the clarification, Shawn.
On Fri, Jun 7, 2013 at 7:34 PM, Jack Krupansky j...@basetechnology.comwrote:
Yes, it SHOULD! And in the LucidWorks Search query parser it does. Why
doesn't it in Solr? Ask Yonik to explain that!
-- Jack Krupansky
-Original Message- From:
Aleksey: What would you say is the average core size for your use case -
thousands or millions of rows? And how sharded would each of your
collections be, if at all?
Average core/collection size wouldn't even be thousands, hundreds more
like. And the largest would be half a million or so but
Thanks. That's what I suspected. Yes, MegaMiniCores.
My scenario is purely hypothetical. But it is also relevant for
multi-tenant use cases, where the users and schemas are not known in
advance and are only online intermittently.
Users could fit three rough size categories: very small,
Cool!
Having those values influenced by stats is a neat idea too. I'll get on that
soon.
Tim
-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Monday, June 03, 2013 5:07 PM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud Load Balancer weight
On Jun 3,
hello all,
environment: Solr 3.5, CentOS
problem statement: I have several character codes that I want to translate
to ordinal (integer) values (for sorting), while retaining the original code
field in the document.
I was thinking that I could use a copyField from my code field to my ord
field
I'm a little confused here. Faceting is about counting docs that meet
your query restrictions. I.e. the q= and fq= clauses. So your original
problem statement simply cannot be combined into a single query
since your q= clauses are different. You could do something like
q=(firstterm OR
This won't help you unless you move to Solr 4.0, but here's an update
processor script from the book that can take the first character of a string
field and add it as an integer value for another field:
<updateRequestProcessorChain name="script-add-char-code">
  <processor
Also from the book, here's an alternative update request processor that uses
a JavaScript script to do the counting and field
creation:
<updateRequestProcessorChain name="script-add-word-count">
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">add-word-count.js</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
hello jack,
thank you for the code ;)
what book are you referring to? AFAICT, all of the 4.0 books are on
pre-order.
we won't be moving to 4.0 (soon enough).
so i take it copyField will not work, e.g. I cannot take a code like ABC
and copy it to an int field and then use a regex to turn
Correct, you need either an update request processor, a custom field type,
or to preprocess your input before you give it to Solr.
You can't do analysis on a non-text field.
The book is my new Solr reference/guide that I will be self-publishing. We
hope to make an Alpha draft available later
I figured as much for atime, thanks Otis!
I haven't run benchmarks just yet, but I'll be sure to share whatever I
find. I plan to try ext4 vs xfs.
I am also curious what effect disabling journaling (ext2) would have,
relying on SolrCloud to manage 'consistency' over many instances vs FS
If it makes you feel better, I also considered this approach when I was in
the same situation with a separate indexer and searcher on one Physical
linux machine.
My main concern was re-using the FS cache between both instances - If I
replicated to myself there would be two independent copies of
I have autocommit after 40k records / 1800 seconds. I only tested with manual
commit, but I don't see why it should work differently.
Roman
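For reference, a sketch of the solrconfig.xml autocommit those numbers suggest (note that maxTime is in milliseconds):

```xml
<autoCommit>
  <maxDocs>40000</maxDocs>      <!-- 40k records -->
  <maxTime>1800000</maxTime>    <!-- 1800 seconds -->
</autoCommit>
```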
On 7 Jun 2013 20:52, Tim Vaillancourt t...@elementspace.com wrote:
If it makes you feel better, I also considered this approach when I was in
the same
thx,
please send me a link to the book so i get/purchase it.
thx
mark
--
View this message in context:
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4068997.html
Sent from the Solr - User mailing list archive at Nabble.com.
can someone point me to a custom field tutorial?
i checked the wiki and this list, but am still a little hazy on how i would do
this.
essentially, when the user issues a query, i want my class to interrogate a
string field (containing several codes, for example boo, baz, bar)
and return a single
We set it up like this:
+ individual Solr instances are set up
+ external mapping/routing to allocate users to instances. This information
can be stored in an external data store
+ all cores are created with transient=true and loadOnStartup=false
+ cores come online on demand
+ as and when users' data get
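A sketch of what such a core entry might look like in a pre-5.x solr.xml (the core name and path are hypothetical):

```xml
<!-- transient cores can be unloaded under memory pressure;
     loadOnStartup=false defers loading until the first request arrives -->
<core name="user12345" instanceDir="cores/user12345"
      transient="true" loadOnStartup="false"/>
```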
What are you trying to do? This seems really odd. I've been working in search
for fifteen years and I've never heard this request.
You could always return all the fields to the client and ignore the ones you
don't want.
wunder
On Jun 7, 2013, at 8:24 PM, geeky2 wrote:
can someone point me
Thank you for all the replies. The issue is solved.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Multitable-import-uniqueKey-tp4067796p4069007.html
Sent from the Solr - User mailing list archive at Nabble.com.