Hi all,
One thing I wanted to clarify: for every other query I get the correct
suggestions, but in these two cases I am not getting what are supposed to
be the suggestions:
1) I have kettle (doc frequency = 5) and cable (doc frequency = 1) indexed
in the direct Solr spell checker, but when I query for cattle I
Hi Hoss,
we don't use autoCommit and autoSoftCommit.
We don't use openSearcher.
We don't use transaction log.
I can see it in the AdminGUI and with
http://master_host:port/solr/replication?command=indexversion
All files are replicated from master to slave, nothing lost.
It is just that the
Thanks Toke,
Seems to be exactly what I try to do.
Regards
Eric
On 08/04/2013 20:02, Toke Eskildsen wrote:
It-forum [it-fo...@meseo.fr]:
For example, I have a product A; this product is compatible with a product
B, versions 1, 5, and 6.
How can I index values like:
compatible_engine :
Hi,
We are comparing search request latency between Amazon vs. dedicated
hosting [Rackspace]. For comparison we used Solr version 3.6.1 and an Amazon
small instance. The index size was less than 1 GB.
We see that the latency is about 75-100% higher on Amazon. Anybody who has
migrated from dedicated
The XML files are formatted like this. I think that is the problem.
<metadataContainerType>
<ns3:object>
<ns3:generic>
<ns3:provided>
<ns3:title>T0084-00371-DOWNLOAD - Blatt 184r</ns3:title>
<ns3:identifier>
Looking a bit deeper showed that replication?command=commit reports the
right indexversion, generation and filelist.
<arr name="commits">
<lst>
<long name="indexVersion">1365357951589</long>
<long name="generation">198</long>
<arr name="filelist">
...
And with replication?command=details I also see the correct
On Tue, 2013-04-09 at 08:40 +0200, It-forum wrote:
On 08/04/2013 20:02, Toke Eskildsen wrote:
compatible_engine:productZ/85 to get all products compatible with productZ,
version 85
compatible_engine:productZ* to get all products compatible with any version
of productZ.
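The scheme above can be sketched outside Solr; this is only an illustration of how a single product/version token supports both the exact and the prefix query (documents, IDs, and values here are made up):

```python
# Each compatible product/version pair is indexed as one token,
# e.g. "productB/5", so productZ/85 is an exact match and productZ*
# is a prefix match. Documents and field values are hypothetical.
docs = [
    {"id": "A", "compatible_engine": ["productB/1", "productB/5", "productB/6"]},
    {"id": "C", "compatible_engine": ["productB/5", "productZ/85"]},
]

def matches(value, query):
    """Emulate Solr term vs. prefix matching on one stored token."""
    if query.endswith("*"):           # prefix query, e.g. productB*
        return value.startswith(query[:-1])
    return value == query             # exact term, e.g. productZ/85

def search(query):
    return [d["id"] for d in docs
            if any(matches(v, query) for v in d["compatible_engine"])]
```

With that, the prefix query finds every version while the product/version token narrows to one version, mirroring the two query forms above.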
Whoops, slash
On 9 April 2013 13:10, Max Bo maximilian.brod...@gmail.com wrote:
The XML files are formatted like this. I think there is the problem.
[...]
Yes, to use curl to post to /solr/update you need to
have XML in the form described at
http://wiki.apache.org/solr/UpdateXmlMessages
Else, you can use
Hi,
thanks for your faster answer.
You don't use the Collections API - may I ask why?
That means you have to set up everything (replicas, ...) manually,
which I would like to avoid.
Also, what I don't understand is why my steps work in 4.0 but not in 4.2.1...
Any clues ?
Kind Regards
Hi,
you are right, I have removed collection1 from the solr.xml but set
defaultCoreName=storage.
This also works in 4.0 but not in 4.2.1 - any clues?
Kind Regards
Alexander
On 2013-04-08 20:06, Joel Bernstein wrote:
The scenario above needs to have collection1 removed from the solr.xml
This question may not have a general answer and may be open-ended, but is
there any commodity server spec for a typical Solr machine? I mean,
what is the average server specification for a Solr machine? (i.e. for a Hadoop
system it is not recommended to have very big storage capably
Hi Alexander,
I have put up a test case reproducing your issue. Perhaps someone more
familiar with faceting code can debug this.
For now, you can work around this issue by adding facet.method=fc to your
queries.
On Mon, Apr 8, 2013 at 2:14 PM, Alexander Buhr a.b...@epages.com wrote:
Hello,
I am trying to modify the results of Solr output. Basically I need to
change the ranking of Solr's output for a query.
So please, can anyone help?
I wrote Java code that returns a SolrDocumentList object which is a
union of the results; I want this object to be displayed by Solr.
hats
Hi Mark,
Is it possible to do a conditional query if another query has no results? For example, say I
want to search against a given field for:
- Search for car. If there are results, return them.
- Else, search for car* . If there are results, return them.
- Else, search for car~ . If
Good morning everyone,
I'm running Solr 4.0 Final with ManifoldCF v1.2dev on Tomcat 7.0.37, and I
had shards up and running over HTTP, but when I migrated to SSL it won't
work anymore.
First I got an IO Exception but then I changed my configuration in
solrconfig.xml to this:
<requestHandler
On Tue, Apr 9, 2013 at 3:33 AM, Sujatha Arun suja.a...@gmail.com wrote:
Would a bigger instance improve latency?
Yes, and prewarming caches would help, too.
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
We're setting up two collection aliases. One's a read alias, one's a
write alias.
When we need to start over with a new collection, we create the
collection alongside the original, and point the write alias at it.
When indexing is done, we point the read alias at it.
Then you can delete the old
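The rotation described above boils down to two pointer flips and a delete; a minimal sketch of the bookkeeping (collection and alias names are hypothetical, and the actual Collections API calls are omitted):

```python
# Read/write alias rotation, as described above. Names are hypothetical;
# in real life each step would be a Collections API call.
aliases = {"read": "products_v1", "write": "products_v1"}
collections = {"products_v1"}

def rebuild(new_name):
    old = aliases["read"]
    collections.add(new_name)    # create the new collection alongside
    aliases["write"] = new_name  # point the write alias at it
    # ... full reindex into the write alias happens here ...
    aliases["read"] = new_name   # when indexing is done, flip reads
    collections.discard(old)     # then delete the old collection
```

Clients only ever talk to the aliases, so the swap is invisible to them.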
We do this on the client side with multiple queries. It is fairly efficient,
because most responses are from the first, exact query.
wunder
On Apr 9, 2013, at 6:15 AM, Koji Sekiguchi wrote:
Hi Mark,
Is it possible to do a conditional query if another query has no results?
For example,
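Doing the fallback client-side, as described, is just a loop over progressively looser queries; a minimal sketch, where run_query stands in for whatever executes one Solr query:

```python
# Client-side fallback: try progressively looser queries and return the
# first non-empty result set (query syntax mirrors the example above:
# exact term, then prefix, then fuzzy).
def fallback_search(term, run_query):
    """run_query is any callable that executes one query and returns a list."""
    for query in (term, term + "*", term + "~"):
        results = run_query(query)
        if results:
            return results
    return []
```

As noted above, this is fairly efficient in practice because most requests are answered by the first, exact query.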
I have around 100M textual documents, geotagged (lat,long). These
documents are indexed with Solr 1.4. I am testing a retrieval model
(written on top of Terrier). This model requires frequent execution of
queries (bounding-box filters). These queries could be executed in
parallel, one for each
Sorry, I didn't explain myself well. I mean you have to create an
additional field 'hasCategory' in your schema, and then, before indexing,
set the field 'hasCategory' in the indexed document to true if your
document has categories, or set it to false if it has none. With this you
will save
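Deriving the suggested hasCategory flag before indexing can be sketched like this (the field name comes from the suggestion above; the document shape is hypothetical):

```python
# Add the boolean 'hasCategory' field to each document before indexing,
# so queries can filter on it instead of testing the multi-valued field.
def with_has_category(doc):
    doc = dict(doc)  # don't mutate the caller's document
    doc["hasCategory"] = bool(doc.get("categories"))
    return doc

docs = [
    {"id": "1", "categories": ["books"]},
    {"id": "2", "categories": []},
]
indexed = [with_has_category(d) for d in docs]
```

A filter query on hasCategory:true is then cheap, which is the saving being described.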
Hi guys,
I'm getting exceptions in a Solr slave when accessing the TermVector component
and RealTimeGetHandler. The weird thing is that on the master and on one of
the 2 slaves the documents are OK, and the same query doesn't return any
exception. For now, the only way I have to solve the problem
Sorry, I forgot to say: the exceptions are not for every document, but only
for a few...
regards,
Victor
Victor Ruiz wrote
Hi guys,
I'm getting exceptions in a Solr slave, when accessing TermVector
component and RealTimeGetHandler. The weird thing is, that in the master
and in one of the 2
: And with replication?command=details I also see the correct commit part as
: above, BUT where the hell is the wrong info below the commit array
: coming from?
Please read the details in the previously mentioned Jira issue...
https://issues.apache.org/jira/browse/SOLR-4661
The
Hi all,
I have been working through the examples on the SolrCloud page:
http://wiki.apache.org/solr/SolrCloud
I am now at the point where, rather than firing up Solr through start.jar,
I'm deploying the Solr war in to Tomcat instances. Taking the following
command as an example:
java
I'm not sure, but you can create a class extending SearchComponent and
include it in the last-components of your request handler; in this way you
can add optional actions to any query on your Solr server.
Example solrconfig.xml:
<requestHandler
...
<arr name="last-components">
<str>actions</str>
</arr>
Hi All,
I am migrating from Solr 3.5.0 to Solr 4.2.1. Everything is running
fine and ready to go, except the master-slave replication.
We use master-slave replication with multiple cores (1 master, 10 slaves, and
20+ cores).
My configuration is:
Master: Solr 3.5.0, has an existing index,
Hello,
I am using the Result Grouping feature with SolrCloud, and it seems that
grouping does not work with field types having precisionStep property
greater than 0, in distributed mode.
I updated the SolrCloud - Getting Started page example A (Simple two
shard cluster).
In my schema.xml, the
Hi,
I'd move to SolrCloud 4.2.1 to benefit from sharding, replication, and
the latest Lucene. How many queries you will then be able to run in
parallel will depend on their complexity, index size, query
cachability, index size, latency requirements... But move to the
latest setup first.
Otis
--
Hi Sujatha,
You should really do the same stuff to improve latency in the cloud as
what you would do on a dedicated server.
Amazon-specific stuff:
Bigger EC2 instances have better IO. EBS performance varies. Some
people mount N of them and stripe across them. Some people try N EBS
volumes to
Hi All,
Deploying Solr 4.2.1 to GlassFish 3.1.1 results in the error below. I have
seen similar problems being reported with Solr 4.2
and my take-away was that 4.2.1 contains the necessary fix.
Any help with this will be appreciated.
Thanks!
2013-04-09 10:45:06,144 [main] ERROR
: Deploying Solr 4.2.1 to GlassFish 3.1.1 results in the error below. I
: have seen similar problems being reported with Solr 4.2
Are you trying to use server SSL with GlassFish?
Can you please post the full stack trace so we can see where this error is
coming from?
My best guess is that
: I'd move to SolrCloud 4.2.1 to benefit from sharding, replication, and
: the latest Lucene. How many queries you will then be able to run in
: parallel will depend on their complexity, index size, query
: cachability, index size, latency requirements... But move to the
: latest setup first.
Hi all,
I wanted to know what the difference between the results could be if I apply
boosts across, say, 5 fields in a query, like for the
first: title^10.0 features^7.0 cat^5.0 color^3.0 root^1.0, and
second, settings like: title^10.0 features^5.0 cat^3.0 color^2.0 root^1.0.
What could the difference be, as
Not sure if I'm missing something, but in the first case the features, cat,
and color fields have more weight, so matches on them will have a bigger
contribution to the overall relevancy score.
Otis
--
Solr ElasticSearch Support
http://sematext.com/
On Tue, Apr 9, 2013 at 1:52 PM, Rohan Thakur
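A toy illustration of that point: real Lucene scoring involves tf-idf and normalization, but the per-field boost simply scales each field's contribution, so the two settings differ mainly in how much non-title matches count (all the raw scores below are made up):

```python
# Toy additive score: boost * raw per-field match score. The raw scores
# are invented; only the relative weight of the two boost settings matters.
def boosted_score(field_scores, boosts):
    return sum(boosts[f] * s for f, s in field_scores.items())

# A document matching on everything except title:
match  = {"title": 0.0, "features": 1.0, "cat": 1.0, "color": 1.0, "root": 1.0}
first  = {"title": 10.0, "features": 7.0, "cat": 5.0, "color": 3.0, "root": 1.0}
second = {"title": 10.0, "features": 5.0, "cat": 3.0, "color": 2.0, "root": 1.0}
```

Under the first setting the non-title fields contribute 16.0, under the second only 11.0, so documents matching on features/cat/color rank relatively higher with the first setting.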
Hi,
You are right there is no average. I saw a Solr cluster with a
few EC2 micro instances yesterday and regularly see Solr running on 16
or 32 GB RAM and sometimes well over 100 GB RAM. Sometimes they have
just 2 CPU cores, sometimes 32 or more. Some use SSDs, some HDDs,
some local
We mostly run m1.xlarge with an 8GB heap. --wunder
On Apr 9, 2013, at 10:57 AM, Otis Gospodnetic wrote:
Hi,
You are right there is no average. I saw a Solr cluster with a
few EC2 micro instances yesterday and regularly see Solr running on 16
or 32 GB RAM and sometimes well over 100 GB
Hi,
I wrote a test of my application which revealed a Solr oddity (I think).
The test, which I wrote on Windows 7 and which makes use of the
solr-test-framework (http://lucene.apache.org/solr/4_1_0/solr-test-framework/index.html),
fails under Ubuntu 12.04 because the Solr results I expected for a wildcard
In Ubuntu, I've added them to /etc/default/tomcat7 in the JAVA_OPTS options.
For example, I have:
JAVA_OPTS="-Djava.awt.headless=true -Xmx2048m -XX:+UseConcMarkSweepGC"
JAVA_OPTS="${JAVA_OPTS} -DnumShards=2 -Djetty.port=8080
-DzkHost=zookeeper01.dev.:2181 -Dbootstrap_conf=true"
--
Nate Fox
Sr Systems
Hi Edd;
The parameters you mentioned are JVM parameters. There are two ways to
define them.
The first: if you are using an IDE, you can specify them as JVM
parameters, i.e. in IntelliJ IDEA, when you open your
Run/Debug configurations there is a field called VM Options. You can
You may also be interested in looking at things like solrbase (on Github).
Otis
--
Solr ElasticSearch Support
http://sematext.com/
On Sat, Apr 6, 2013 at 6:01 PM, Furkan KAMACI furkankam...@gmail.com wrote:
Hi;
First of all, I should mention that I am new to Solr and doing research
Hi Walter;
Could I ask what the average size of your Solr indexes is, and the average
queries per second to your Solr? Maybe I can come up with an assumption.
2013/4/9 Walter Underwood wun...@wunderwood.org
We mostly run m1.xlarge with an 8GB heap. --wunder
On Apr 9, 2013, at 10:57 AM, Otis
Hello,
I'm trying to index a large number of documents in different languages.
I don't know the language of the document, so I'm using
TikaLanguageIdentifierUpdateProcessorFactory to identify it.
So, this is my configuration in solrconfig.xml
<updateRequestProcessorChain name="langid">
My main concern was just making sure we were getting the best search
performance, and that we did not have too many segments. Every attempt I
made to adjust the segment count resulted in no difference (segment
count never changed). Looking at that blog page, it looks like 30-40
segments is
Hi,
Typically people try to figure out the query language somehow.
Queries are short, so LID on them is hard. But user profile could
indicate a language, or users can be asked and such.
Otis
--
Solr ElasticSearch Support
http://sematext.com/
On Tue, Apr 9, 2013 at 2:32 PM,
If it isn't obvious, I'm glad to help test a patch for this. We can run a
simulated production load in dev and report to our metrics server.
wunder
On Apr 8, 2013, at 1:07 PM, Walter Underwood wrote:
That approach sounds great. --wunder
On Apr 7, 2013, at 9:40 AM, Alan Woodward wrote:
Have you looked at edismax and the 'qf' fields parameter? It allows you to
define the fields to search. Also, you can define those parameters in
solrconfig.xml and not have to send them down the wire.
Finally, you can define several different request handlers (e.g. /ensearch,
/frsearch) and have
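The per-language handler idea amounts to storing a parameter set per handler so only q travels over the wire; a rough sketch of the effective behavior (the handler names follow the example above, the field names are hypothetical):

```python
# Per-language request handlers modeled as stored parameter sets,
# standing in for the defaults defined in solrconfig.xml.
handlers = {
    "/ensearch": {"defType": "edismax", "qf": "title_en^2 body_en"},
    "/frsearch": {"defType": "edismax", "qf": "title_fr^2 body_fr"},
}

def build_params(handler, user_query):
    params = dict(handlers[handler])  # defaults baked into the handler
    params["q"] = user_query          # only q is sent down the wire
    return params
```

The client picks the handler for the user's language and never has to know about qf or boosts.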
Thanks for replying.
My config:
- 40 dedicated servers, dual-core each
- Running the Tomcat servlet container on Linux
- 12 GB RAM per server, split half-and-half between the OS and Solr
- Complex queries (up to 30 conditions on different fields), 1 qps rate
Sharding my index was done for two reasons, based
On 4/9/2013 12:08 PM, P Williams wrote:
I wrote a test of my application which revealed a Solr oddity (I think).
The test, which I wrote on Windows 7 and which makes use of the
solr-test-framework (http://lucene.apache.org/solr/4_1_0/solr-test-framework/index.html),
fails under Ubuntu 12.04 because the
On 4/9/2013 2:10 PM, Manuel Le Normand wrote:
Thanks for replying.
My config:
- 40 dedicated servers, dual-core each
- Running the Tomcat servlet container on Linux
- 12 GB RAM per server, split half-and-half between the OS and Solr
- Complex queries (up to 30 conditions on different fields), 1 qps
Thanks for the replies. The problem I have is that setting them at the JVM
level would mean that all instances of Solr deployed in the Tomcat instance
are forced to use the same settings. I actually want to set the properties
at the application level (e.g. in solr.xml, zoo.conf or maybe an
Hey Shawn,
My gut says the difference in assignment of docids has to do with how the
FileListEntityProcessor (http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor)
works on the two operating systems. My guess is that the documents are
updated/imported in a different order, but I haven't
Hi Shawn;
You say that:
*... your documents are about 50KB each. That would translate to an index
that's at least 25GB*
I know we cannot give an exact size, but what is the approximate ratio of
document size to index size in your experience?
2013/4/9 Shawn Heisey s...@elyograg.org
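The arithmetic behind such an estimate is simple; the hard part is the ratio, which depends heavily on the schema (stored fields, term vectors, analysis, etc.). A sketch with a made-up 0.5 ratio:

```python
# Back-of-the-envelope index size estimate, in decimal GB. The ratio is
# schema-dependent and invented here; measure a real sample before trusting it.
def estimate_index_gb(num_docs, avg_doc_kb, ratio):
    return num_docs * avg_doc_kb * ratio / 1e6

# e.g. one million ~50 KB documents at a 0.5 ratio:
size = estimate_index_gb(1_000_000, 50, 0.5)
```

Indexing a few thousand representative documents and extrapolating gives a far better ratio than any rule of thumb.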
Is there anybody who can help me guess the approximate amount of RAM
needed for 5000 queries/second on a Solr machine?
It all depends on the nature of your query and the nature of the data in the
index. Does returning results from a result cache count in your QPS? Not to
mention how many cores and CPU speed and CPU caching as well. Not to mention
network latency.
The best way to answer is to do a proof of
Actually, I will propose a system and I need to figure out the machine
specifications. There will be no faceting mechanism at first, just the simple
search queries of a web search engine. We can assume that I will have a
commodity server (I don't know whether there is any benchmark for a typical
Solr machine)
On Apr 9, 2013, at 3:06 PM, Furkan KAMACI wrote:
Is there anybody who can help me guess the approximate amount of RAM
needed for 5000 queries/second on a Solr machine?
No.
That depends on the kind of queries you have, the size and content of the
index, the required response time, how
Hi Walter;
Firstly, thanks for your detailed reply. I know that this is not a very
detailed question, but I don't have any metrics yet. If we talk about your
system, what is the average RAM size of your Solr machines? Maybe that can
help me make a comparison.
2013/4/10 Walter Underwood
We are using Amazon EC2 M1 Extra Large instances (m1.xlarge).
http://aws.amazon.com/ec2/instance-types/
wunder
On Apr 9, 2013, at 3:35 PM, Furkan KAMACI wrote:
Hi Walter;
Firstly, thanks for your detailed reply. I know that this is not a very
detailed question, but I don't have any metrics
Thanks for your answer.
2013/4/10 Walter Underwood wun...@wunderwood.org
We are using Amazon EC2 M1 Extra Large instances (m1.xlarge).
http://aws.amazon.com/ec2/instance-types/
wunder
On Apr 9, 2013, at 3:35 PM, Furkan KAMACI wrote:
Hi Walter;
Firstly, thanks for your detailed reply.
If anybody could still help me out with this, I'd really appreciate it.
Thanks!
--
View this message in context:
http://lucene.472066.n3.nabble.com/Pushing-a-whole-set-of-pdf-files-to-solr-tp4025256p4054885.html
Sent from the Solr - User mailing list archive at Nabble.com.
The Apache Solr 4 Cookbook says that:
curl "http://localhost:8983/solr/update/extract?literal.id=1&commit=true" \
-F "myfile=@cookbook.pdf"
is that what you want?
2013/4/10 sdspieg sdsp...@mail.ru
If anybody could still help me out with this, I'd really appreciate it.
Thanks!
The newer release of SimplePostTool with Solr 4.x makes it easy to post PDF
files from a directory, including automatically adding the file name to a
field. But SolrCell is the direct API that it uses as well.
-- Jack Krupansky
-Original Message-
From: Furkan KAMACI
Sent: Tuesday,
On 4/9/2013 3:50 PM, Furkan KAMACI wrote:
Hi Shawn;
You say that:
*... your documents are about 50KB each. That would translate to an index
that's at least 25GB*
I know we cannot give an exact size, but what is the approximate ratio of
document size to index size in your
: Thanks for the replies. The problem I have is that setting them at the JVM
: level would mean that all instances of Solr deployed in the Tomcat instance
: are forced to use the same settings. I actually want to set the properties
: at the application level (e.g. in solr.xml, zoo.conf or maybe an
Raymond Wiker wrote
You have misspelt the tag name in the field definition... you have fiald
instead of field.
Thank you Raymond, it was really hard to spot in such a massive schema
file.
-
Smart, but it doesn't work... If it worked, it would do the job...
(Zeki ama calismiyor... Calissa yapar...)
On 4/9/2013 4:06 PM, Furkan KAMACI wrote:
Is there anybody who can help me guess the approximate amount of RAM
needed for 5000 queries/second on a Solr machine?
You've already gotten some good replies, and I'm aware that they haven't
really answered your question. This is the kind of
These are really good metrics for me:
You say that RAM size should be at least the index size, and that it is better
to have RAM twice the index size (because of the worst-case scenario).
On the other hand, let's assume that I have more RAM than twice the index
size on the machine. Can Solr
: My gut says the difference in assignment of docids has to do with how the
: FileListEntityProcessor (http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor)
docids just represent the order in which documents are added to the index. If
you use DIH with FileListEntityProcessor to create
Thanks for those replies. I will look into them. But if anyone knows of a
site that describes, step by step, how a Windows user who has already
installed Solr (and Tomcat) can easily feed a folder (and subfolders) with
100s of PDFs into Solr, or would be willing to write down those steps,
I
I am able to run the java -jar post.jar -help command which I found here:
http://docs.lucidworks.com/display/solr/Running+Solr. But now how can I tell
post to post all pdf files in a certain folder (preferably recursively) to a
collection? Could anybody please post the exact command for that?
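Collecting the PDF paths recursively is the easy half of the job; a sketch of just that part (the actual posting, whether via post.jar or curl to /update/extract, is left out):

```python
import os

# Walk a folder recursively and collect every PDF, ready to hand to
# whatever posting mechanism is used (post.jar, curl to /update/extract).
def find_pdfs(root):
    pdfs = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                pdfs.append(os.path.join(dirpath, name))
    return sorted(pdfs)
```

This avoids having to flatten the directory tree by hand before posting.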
Hi Chris,
Thanks for your response.
My understanding is that GlassFish specifies the keystore as a system property,
but does not specify the password in order to protect it from
snooping. There's
a keychain that requires a password to be passed from the DAS in order to
unlock the key for the
On 10 April 2013 07:28, sdspieg sdsp...@mail.ru wrote:
I am able to run the java -jar post.jar -help command which I found here:
http://docs.lucidworks.com/display/solr/Running+Solr. But now how can I tell
post to post all pdf files in a certain folder (preferably recursively) to a
collection?
Another progress report. I 'flattened' all the folders which contained the
pdf files with Fileboss and then moved the pdf files to the directory where
I found the post.jar file (in solr-4.2.1\solr-4.2.1\example\exampledocs). I
then ran java -Ddata=files -jar post.jar *.pdf and in the command
On 4/9/2013 7:03 PM, Furkan KAMACI wrote:
These are really good metrics for me:
You say that RAM size should be at least index size, and it is better to
have a RAM size twice the index size (because of worst case scenario).
On the other hand let's assume that I have a RAM size that is
On 10 April 2013 08:11, sdspieg sdsp...@mail.ru wrote:
Another progress report. I 'flattened' all the folders which contained the
pdf files with Fileboss and then moved the pdf files to the directory where
I found the post.jar file (in solr-4.2.1\solr-4.2.1\example\exampledocs). I
then ran
The newer SimplePostTool can in fact recurse a directory of PDFs. Just get
the usage for the tool. I'm sure it lists the command options.
-- Jack Krupansky
-Original Message-
From: sdspieg
Sent: Tuesday, April 09, 2013 9:48 PM
To: solr-user@lucene.apache.org
Subject: Re: Pushing a
Adding debugQuery=true is your friend. I suspect that you'll find
your first query is actually searching
name:coldfusion OR defaultsearchfield:cache and you _think_ it's
searching for both coldfusion and cache in the name field
Best
Erick
On Mon, Apr 8, 2013 at 2:50 AM, amit
I am sorry, but you said:
*you need enough free RAM for the OS to cache the maximum amount of disk
space all your indexes will ever use*
I have made an assumption about the indexes on my machine. Let's assume the
total is 5 GB. So it is better to have at least 5 GB of RAM? OK, Solr will
use RAM up to how
On 4/9/2013 9:12 PM, Furkan KAMACI wrote:
I am sorry but you said:
*you need enough free RAM for the OS to cache the maximum amount of disk
space all your indexes will ever use*
I have made an assumption about the indexes on my machine. Let's assume the
total is 5 GB. So it is better to have at
Please update?
-Original Message-
From: Sandeep Kumar Anumalla
Sent: 31 March, 2013 12:08 PM
To: solr-user@lucene.apache.org
Cc: 'Joel Bernstein'
Subject: RE: Solr index Backup and restore of large indexes
Hi,
I am exploring all the possible options.
We want to distribute 1 TB traffic
Hi Otis,
Can you explain that in some more depth? For example, if I search for led in
both cases, what could the difference in the results I get be?
Thanks in advance.
Regards,
Rohan
On Tue, Apr 9, 2013 at 11:25 PM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
Not sure if i'm missing something but
Hi Erick,
My main point is that if I use replication, I have to use a similar kind of
setup (hardware, storage space) as the master, which is not cost-effective;
that is why I am looking at incremental backup options, so that I can keep
these backups anywhere, like external hard disks or tapes.
And