spell suggestions help

2013-04-09 Thread Rohan Thakur
Hi all, one thing I wanted to clear up: for every other query I get correct suggestions, but in these 2 cases I am not getting the suggestions I expect: 1) I have the words kettle (doc frequency=5) and cable (doc frequency=1) indexed in the direct Solr spell checker, but when I query for cattle I

Re: solr 4.2.1 still has problems with index version and index generation

2013-04-09 Thread Bernd Fehling
Hi Hoss, we don't use autoCommit and autoSoftCommit. We don't use openSearcher. We don't use transaction log. I can see it in the AdminGUI and with http://master_host:port/solr/replication?command=indexversion All files are replicated from master to slave, nothing lost. It is just that the

Re: Sub field indexing

2013-04-09 Thread It-forum
Thanks Toke, this seems to be exactly what I am trying to do. Regards Eric On 08/04/2013 20:02, Toke Eskildsen wrote: It-forum [it-fo...@meseo.fr]: For example, I have a product A; this product is compatible with a Product B version 1, 5, 6. How can I index values like : compatible_engine :

Latency Comparison between cloud hosting Vs Dedicated hosting

2013-04-09 Thread Sujatha Arun
Hi, We are comparing search request latency between Amazon and dedicated hosting [Rackspace]. For the comparison we used Solr version 3.6.1 and an Amazon small instance. The index size was less than 1GB. We see that the latency is about 75 -100 % from Amazon. Anybody who has migrated from Dedicated

Re: Indexed data not searchable

2013-04-09 Thread Max Bo
The XML files are formatted like this. I think there is the problem. <metadataContainerType> <ns3:object> <ns3:generic> <ns3:provided> <ns3:title>T0084-00371-DOWNLOAD - Blatt 184r</ns3:title> <ns3:identifier>

Re: solr 4.2.1 still has problems with index version and index generation

2013-04-09 Thread Bernd Fehling
Looking a bit deeper showed that replication?command=commit reports the right indexversion, generation and filelist. <arr name="commits"> <lst> <long name="indexVersion">1365357951589</long> <long name="generation">198</long> <arr name="filelist"> ... And with replication?command=details I also see the correct

Re: Sub field indexing

2013-04-09 Thread Toke Eskildsen
On Tue, 2013-04-09 at 08:40 +0200, It-forum wrote: On 08/04/2013 20:02, Toke Eskildsen wrote: compatible_engine:productZ/85 to get all products compatible with productZ, version 85 compatible_engine:productZ* to get all products compatible with any version of productZ. Whoops, slash

Re: Indexed data not searchable

2013-04-09 Thread Gora Mohanty
On 9 April 2013 13:10, Max Bo maximilian.brod...@gmail.com wrote: The XML files are formatted like this. I think there is the problem. [...] Yes, to use curl to post to /solr/update you need to have XML in the form described at http://wiki.apache.org/solr/UpdateXmlMessages Else, you can use

Re: Empty Solr 4.2.1 can not create Collection

2013-04-09 Thread A.Eibner
Hi, thanks for your fast answer. You don't use the Collection API - may I ask why? That means you have to set up everything (replicas, ...) manually, which I would like to avoid. Also, what I don't understand is why my steps work in 4.0 but not in 4.2.1... Any clues ? Kind Regards

Re: Empty Solr 4.2.1 can not create Collection

2013-04-09 Thread A.Eibner
Hi, you are right, I have removed collection1 from the solr.xml but set defaultCoreName=storage. Also this works in 4.0 but not in 4.2.1, any clues ? Kind Regards Alexander On 2013-04-08 20:06, Joel Bernstein wrote: The scenario above needs to have collection1 removed from the solr.xml

Average Solr Server Spec.

2013-04-09 Thread Furkan KAMACI
This question may not have a general answer and may be open ended, but is there any commodity server spec. for a typical machine running Solr? I mean, what is the average server specification for a Solr machine (i.e. Hadoop running system it is not recommended to have very big storage capably

Re: SOLR-4581

2013-04-09 Thread Shalin Shekhar Mangar
Hi Alexander, I have put up a test case reproducing your issue. Perhaps someone more familiar with the faceting code can debug this. For now, you can work around this issue by adding facet.method=fc to your queries. On Mon, Apr 8, 2013 at 2:14 PM, Alexander Buhr a.b...@epages.com wrote: Hello,

Doc Transformer with SolrDocumentList object

2013-04-09 Thread neha yadav
I am trying to modify Solr's output: basically, I need to change the ranking of the results Solr returns for a query. Can anyone please help? I wrote Java code that returns a SolrDocumentList object which is a union of the results, and I want this object to be displayed by Solr. hats

Re: conditional queries?

2013-04-09 Thread Koji Sekiguchi
Hi Mark, Is it possible to do a conditional query if another query has no results? For example, say I want to search against a given field for: - Search for car. If there are results, return them. - Else, search for car* . If there are results, return them. - Else, search for car~ . If

How to configure shards with SSL?

2013-04-09 Thread eShard
Good morning everyone, I'm running solr 4.0 Final with ManifoldCF v1.2dev on tomcat 7.0.37 and I had shards up and running on http but when I migrated to SSL it won't work anymore. First I got an IO Exception but then I changed my configuration in solrconfig.xml to this: requestHandler

Re: Latency Comparison between cloud hosting Vs Dedicated hosting

2013-04-09 Thread Michael Della Bitta
On Tue, Apr 9, 2013 at 3:33 AM, Sujatha Arun suja.a...@gmail.com wrote: Would a bigger instance improve latency? Yes, and prewarming caches would help, too. Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271

Re: Best practice for rebuild index in SolrCloud

2013-04-09 Thread Michael Della Bitta
We're setting up two collection aliases. One's a read alias, one's a write alias. When we need to start over with a new collection, we create the collection alongside the original, and point the write alias at it. When indexing is done, we point the read alias at it. Then you can delete the old
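The alias swap described above can be sketched roughly as follows. This is only an illustration in Python that builds the Collections API URLs (the CREATEALIAS action) without sending them; the base URL, the alias names `products_read` / `products_write` and the collection name `products_v2` are made-up placeholders, not names from the thread, and creating or deleting the collections themselves is left out.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collection):
    """Build a Collections API URL that points `alias` at `collection`."""
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{base_url}/admin/collections?{urlencode(params)}"


def reindex_plan(base_url, new_collection,
                 read_alias="products_read", write_alias="products_write"):
    """Return the alias changes in the order described in the message.

    The alias names are hypothetical defaults chosen for this sketch.
    """
    return [
        # 1. Indexing goes to the new collection while reads stay on the old one.
        ("repoint write alias", create_alias_url(base_url, write_alias, new_collection)),
        # 2. Once indexing is done, searches move over as well.
        ("repoint read alias", create_alias_url(base_url, read_alias, new_collection)),
    ]


if __name__ == "__main__":
    for step, url in reindex_plan("http://localhost:8983/solr", "products_v2"):
        print(step, "->", url)
```

Keeping reads and writes behind separate aliases means clients never need to know the real collection name, so the switch happens without touching client configuration.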

Re: conditional queries?

2013-04-09 Thread Walter Underwood
We do this on the client side with multiple queries. It is fairly efficient, because most responses are from the first, exact query. wunder On Apr 9, 2013, at 6:15 AM, Koji Sekiguchi wrote: Hi Mark, Is it possible to do a conditional query if another query has no results? For example,
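A minimal sketch of the client-side approach, assuming a `search` callable that takes a query string and returns a list of matching documents (the callable and its return shape are assumptions of this example, not part of any Solr client API):

```python
def search_with_fallback(search, term):
    """Try the exact term, then a prefix query, then a fuzzy query.

    `search` is any callable mapping a query string to a list of documents.
    Returns the query that produced results together with those results,
    or (None, []) if every variant came back empty.
    """
    for query in (term, term + "*", term + "~"):
        docs = search(query)
        if docs:
            return query, docs
    return None, []


if __name__ == "__main__":
    # Stand-in for a real Solr call: only the prefix query matches anything.
    fake_index = {"car*": ["carpet", "cargo"]}
    print(search_with_fallback(lambda q: fake_index.get(q, []), "car"))
```

Because the loop stops at the first non-empty response, the extra queries are only issued when the exact query finds nothing.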

Execution of Queries in Parallel: geotagged textual documents in Solrvvvv

2013-04-09 Thread Massimiliano Ruocco
I have around 100M geotagged (lat,long) textual documents. These documents are indexed with Solr 1.4. I am testing a retrieval model (written over Terrier). This model requires frequent execution of queries ( Bounding-box filter). These queries could be executed in parallel, one for each

Re: Search data who does not have x field

2013-04-09 Thread Victor Ruiz
Sorry, I didn't explain myself well. I mean, you have to create an additional field 'hasCategory' in your schema, and then, before indexing, set the field 'hasCategory' in the indexed document to true if your document has categories, or to false if it has none. With this you will save
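As a rough sketch of the pre-indexing step, assuming documents are plain dictionaries and the category values live in a field called `categories` (that field name is an assumption of this example; only `hasCategory` comes from the message):

```python
def with_has_category(doc, category_field="categories"):
    """Return a copy of `doc` with a boolean `hasCategory` flag added.

    The flag is True when the category field is present and non-empty,
    and False when it is missing, None, or an empty list.
    """
    enriched = dict(doc)
    enriched["hasCategory"] = bool(doc.get(category_field))
    return enriched


if __name__ == "__main__":
    print(with_has_category({"id": "1", "categories": ["books"]}))
    print(with_has_category({"id": "2"}))
```

Documents without categories can then be found with a plain term query such as `hasCategory:false`.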

corrupted index in slave?

2013-04-09 Thread Victor Ruiz
Hi guys, I'm getting exceptions in a Solr slave when accessing the TermVector component and RealTimeGetHandler. The weird thing is that on the master and on one of the 2 slaves the documents are OK, and the same query doesn't return any exception. For now, the only way I have to solve the problem

Re: corrupted index in slave?

2013-04-09 Thread Victor Ruiz
Sorry, I forgot to say: the exceptions are not for every document, but only for a few... regards, Victor Victor Ruiz wrote Hi guys, I'm getting exceptions in a Solr slave when accessing the TermVector component and RealTimeGetHandler. The weird thing is that on the master and on one of the 2

Re: solr 4.2.1 still has problems with index version and index generation

2013-04-09 Thread Chris Hostetter
: And with replication?command=details I also see the correct commit part as : above, BUT where the hell are the wrong info below the commit array are : coming from? Please read the details in the previously mentioned Jira issue... https://issues.apache.org/jira/browse/SOLR-4661 The

How can I set configuration options?

2013-04-09 Thread Edd Grant
Hi all, I have been working through the examples on the SolrCloud page: http://wiki.apache.org/solr/SolrCloud I am now at the point where, rather than firing up Solr through start.jar, I'm deploying the Solr war in to Tomcat instances. Taking the following command as an example: java

Re: conditional queries?

2013-04-09 Thread Miguel
I'm not sure, but you can create a class that extends SearchComponent and add it at the end of your request handler's component list; that way you can run optional actions on any query to your Solr server. Example solrconfig.xml: <requestHandler ...> <arr name="last-components"> <str>actions</str> </arr>

Index Replication Failing in Solr 4.2.1

2013-04-09 Thread Umesh Prasad
Hi All, I am migrating from Solr 3.5.0 to Solr 4.2.1. And everything is running fine and set to go, except the master slave replication. We use master slave replication with multi cores ( 1 master, 10 slaves and 20 plus cores). My Configuration is : Master : Solr 3.5.0, Has existing index,

SolrCloud: Result Grouping - no groups with field type with precisionStep 0

2013-04-09 Thread Elodie Sannier
Hello, I am using the Result Grouping feature with SolrCloud, and it seems that grouping does not work with field types having precisionStep property greater than 0, in distributed mode. I updated the SolrCloud - Getting Started page example A (Simple two shard cluster). In my schema.xml, the

Re: Execution of Queries in Parallel: geotagged textual documents in Solrvvvv

2013-04-09 Thread Otis Gospodnetic
Hi, I'd move to SolrCloud 4.2.1 to benefit from sharding, replication, and the latest Lucene. How many queries you will then be able to run in parallel will depend on their complexity, index size, query cacheability, latency requirements... But move to the latest setup first. Otis --

Re: Latency Comparison between cloud hosting Vs Dedicated hosting

2013-04-09 Thread Otis Gospodnetic
Hi Sujatha, You should really do the same stuff to improve latency in the cloud as what you would do on a dedicated server. Amazon-specific stuff: Bigger EC2 instances have better IO. EBS performance varies. Some people mount N of them and stripe across them. Some people try N EBS volumes to

Solr 4.2.1 SSLInitializationException

2013-04-09 Thread Sarita Nair
Hi All, Deploying Solr 4.2.1 to GlassFish 3.1.1 results in the error below.  I have seen similar problems being reported with Solr 4.2 and my take-away was that 4.2.1 contains the necessary fix. Any help with this will be appreciated. Thanks!     2013-04-09 10:45:06,144 [main] ERROR

Re: Solr 4.2.1 SSLInitializationException

2013-04-09 Thread Chris Hostetter
: Deploying Solr 4.2.1 to GlassFish 3.1.1 results in the error below.  I : have seen similar problems being reported with Solr 4.2 Are you trying to use server SSL with glassfish? can you please post the full stack trace so we can see where this error is coming from. My best guess is that

Re: Execution of Queries in Parallel: geotagged textual documents in Solrvvvv

2013-04-09 Thread Chris Hostetter
: I'd move to SolrCloud 4.2.1 to benefit from sharding, replication, and : the latest Lucene. How many queries you will then be able to run in : parallel will depend on their complexity, index size, query : cacheability, latency requirements... But move to the : latest setup first.

query regarding the use of boost across the fields in edismax query

2013-04-09 Thread Rohan Thakur
Hi all, I wanted to know what the difference in results would be if I apply boosts across, say, 5 fields in a query, first like this: title^10.0 features^7.0 cat^5.0 color^3.0 root^1.0 and second with settings like: title^10.0 features^5.0 cat^3.0 color^2.0 root^1.0 what could be the difference as

Re: query regarding the use of boost across the fields in edismax query

2013-04-09 Thread Otis Gospodnetic
Not sure if I'm missing something, but in the first case the features, cat, and color fields have more weight, so matches on them will make a bigger contribution to the overall relevancy score. Otis -- Solr ElasticSearch Support http://sematext.com/ On Tue, Apr 9, 2013 at 1:52 PM, Rohan Thakur
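One way to see the difference between the two boost settings is to express every boost relative to the title boost. The sketch below only does that arithmetic on the two strings from the question; it does not reproduce how edismax actually combines field scores, and the helper names are made up for this example.

```python
def parse_boosts(qf):
    """Parse a string like 'title^10.0 cat^5.0' into {'title': 10.0, 'cat': 5.0}."""
    boosts = {}
    for token in qf.split():
        field, _, boost = token.partition("^")
        # A field listed without an explicit boost is treated as 1.0 here.
        boosts[field] = float(boost) if boost else 1.0
    return boosts


def relative_to(boosts, reference="title"):
    """Express each boost as a fraction of the reference field's boost."""
    ref = boosts[reference]
    return {field: boost / ref for field, boost in boosts.items()}


if __name__ == "__main__":
    first = parse_boosts("title^10.0 features^7.0 cat^5.0 color^3.0 root^1.0")
    second = parse_boosts("title^10.0 features^5.0 cat^3.0 color^2.0 root^1.0")
    print(relative_to(first))
    print(relative_to(second))
```

In the first setting features, cat and color carry 0.7, 0.5 and 0.3 of the title weight; in the second they carry 0.5, 0.3 and 0.2, so matches in those fields count for less relative to title matches.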

Re: Average Solr Server Spec.

2013-04-09 Thread Otis Gospodnetic
Hi, You are right there is no average. I saw a Solr cluster with a few EC2 micro instances yesterday and regularly see Solr running on 16 or 32 GB RAM and sometimes well over 100 GB RAM. Sometimes they have just 2 CPU cores, sometimes 32 or more. Some use SSDs, some HDDs, some local

Re: Average Solr Server Spec.

2013-04-09 Thread Walter Underwood
We mostly run m1.xlarge with an 8GB heap. --wunder On Apr 9, 2013, at 10:57 AM, Otis Gospodnetic wrote: Hi, You are right there is no average. I saw a Solr cluster with a few EC2 micro instances yesterday and regularly see Solr running on 16 or 32 GB RAM and sometimes well over 100 GB

Results Order When Performing Wildcard Query

2013-04-09 Thread P Williams
Hi, I wrote a test of my application which revealed a Solr oddity (I think). The test, which I wrote on Windows 7 and which makes use of the solr-test-framework (http://lucene.apache.org/solr/4_1_0/solr-test-framework/index.html), fails under Ubuntu 12.04 because the Solr results I expected for a wildcard

Re: How can I set configuration options?

2013-04-09 Thread Nate Fox
In Ubuntu, I've added it to /etc/default/tomcat7 in the JAVA_OPTS options. For example, I have: JAVA_OPTS="-Djava.awt.headless=true -Xmx2048m -XX:+UseConcMarkSweepGC" JAVA_OPTS="${JAVA_OPTS} -DnumShards=2 -Djetty.port=8080 -DzkHost=zookeeper01.dev.:2181 -Dbootstrap_conf=true" -- Nate Fox Sr Systems

Re: How can I set configuration options?

2013-04-09 Thread Furkan KAMACI
Hi Edd; The parameters you mentioned are JVM parameters. There are two ways to define them. The first: if you are using an IDE, you can specify them as JVM parameters, i.e. if you are using IntelliJ IDEA, when you open your Run/Debug configurations there is a line called VM Options. You can

Re: Pointing to Hbase for Docuements or Directly Saving Documents at Hbase

2013-04-09 Thread Otis Gospodnetic
You may also be interested in looking at things like solrbase (on Github). Otis -- Solr ElasticSearch Support http://sematext.com/ On Sat, Apr 6, 2013 at 6:01 PM, Furkan KAMACI furkankam...@gmail.com wrote: Hi; First of all should mention that I am new to Solr and making a research

Re: Average Solr Server Spec.

2013-04-09 Thread Furkan KAMACI
Hi Walter; Could you tell me the average size of your Solr indexes and the average queries per second to your Solr? Maybe I can come up with an assumption. 2013/4/9 Walter Underwood wun...@wunderwood.org We mostly run m1.xlarge with an 8GB heap. --wunder On Apr 9, 2013, at 10:57 AM, Otis

Indexing and searching documents in different languages

2013-04-09 Thread dev
Hello, I'm trying to index a large number of documents in different languages. I don't know the language of the document, so I'm using TikaLanguageIdentifierUpdateProcessorFactory to identify it. So, this is my configuration in solrconfig.xml <updateRequestProcessorChain name="langid">

Re: Number of segments

2013-04-09 Thread Michael Long
My main concern was just making sure we were getting the best search performance, and that we did not have too many segments. Every attempt I made to adjust the segment count resulted in no difference (segment count never changed). Looking at that blog page, it looks like 30-40 segments is

Re: Indexing and searching documents in different languages

2013-04-09 Thread Otis Gospodnetic
Hi, Typically people try to figure out the query language somehow. Queries are short, so LID on them is hard. But user profile could indicate a language, or users can be asked and such. Otis -- Solr ElasticSearch Support http://sematext.com/ On Tue, Apr 9, 2013 at 2:32 PM,

Re: Solr metrics in Codahale metrics and Graphite?

2013-04-09 Thread Walter Underwood
If it isn't obvious, I'm glad to help test a patch for this. We can run a simulated production load in dev and report to our metrics server. wunder On Apr 8, 2013, at 1:07 PM, Walter Underwood wrote: That approach sounds great. --wunder On Apr 7, 2013, at 9:40 AM, Alan Woodward wrote:

Re: Indexing and searching documents in different languages

2013-04-09 Thread Alexandre Rafalovitch
Have you looked at edismax and the 'qf' fields parameter? It allows you to define the fields to search. Also, you can define those parameters in solrconfig.xml and not have to send them down the wire. Finally, you can define several different request handlers (e.g. /ensearch, /frsearch) and have

Re: Slow qTime for distributed search

2013-04-09 Thread Manuel Le Normand
Thanks for replying. My config: - 40 dedicated servers, dual-core each - Running Tomcat servlet on Linux - 12 GB RAM per server, split half between OS and Solr - Complex queries (up to 30 conditions on different fields), 1 qps rate Sharding my index was done for two reasons, based

Re: Results Order When Performing Wildcard Query

2013-04-09 Thread Shawn Heisey
On 4/9/2013 12:08 PM, P Williams wrote: I wrote a test of my application which revealed a Solr oddity (I think). The test, which I wrote on Windows 7 and which makes use of the solr-test-framework (http://lucene.apache.org/solr/4_1_0/solr-test-framework/index.html), fails under Ubuntu 12.04 because the

Re: Slow qTime for distributed search

2013-04-09 Thread Shawn Heisey
On 4/9/2013 2:10 PM, Manuel Le Normand wrote: Thanks for replying. My config: - 40 dedicated servers, dual-core each - Running Tomcat servlet on Linux - 12 GB RAM per server, split half between OS and Solr - Complex queries (up to 30 conditions on different fields), 1 qps

Re: How can I set configuration options?

2013-04-09 Thread Edd Grant
Thanks for the replies. The problem I have is that setting them at the JVM level would mean that all instances of Solr deployed in the Tomcat instance are forced to use the same settings. I actually want to set the properties at the application level (e.g. in solr.xml, zoo.conf or maybe an

Re: Results Order When Performing Wildcard Query

2013-04-09 Thread P Williams
Hey Shawn, My gut says the difference in assignment of docids has to do with how the FileListEntityProcessor (http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor) works on the two operating systems. My guess is that the documents are updated/imported in a different order, but I haven't

Re: Slow qTime for distributed search

2013-04-09 Thread Furkan KAMACI
Hi Shawn; You say that: *... your documents are about 50KB each. That would translate to an index that's at least 25GB* I know we cannot give an exact size, but what is the approximate ratio of document size to index size in your experience? 2013/4/9 Shawn Heisey s...@elyograg.org

Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Furkan KAMACI
Is there anybody who can help me estimate the approximate RAM needed for 5000 queries/second on a Solr machine?

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Jack Krupansky
It all depends on the nature of your query and the nature of the data in the index. Does returning results from a result cache count in your QPS? Not to mention how many cores and CPU speed and CPU caching as well. Not to mention network latency. The best way to answer is to do a proof of

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Furkan KAMACI
Actually I will propose a system and I need to figure out the machine specifications. There will be no faceting mechanism at first, just simple search queries of a web search engine. We can assume that I will have a commodity server (I don't know whether there is any benchmark for a typical Solr machine)

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Walter Underwood
On Apr 9, 2013, at 3:06 PM, Furkan KAMACI wrote: Is there anybody who can help me estimate the approximate RAM needed for 5000 queries/second on a Solr machine? No. That depends on the kind of queries you have, the size and content of the index, the required response time, how

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Furkan KAMACI
Hi Walter; Firstly, thanks for your detailed reply. I know that this is not a well-specified question, but I don't have any metrics yet. If we talk about your system, what is the average RAM size of your Solr machines? Maybe that can help me to make a comparison. 2013/4/10 Walter Underwood

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Walter Underwood
We are using Amazon EC2 M1 Extra Large instances (m1.xlarge). http://aws.amazon.com/ec2/instance-types/ wunder On Apr 9, 2013, at 3:35 PM, Furkan KAMACI wrote: Hi Walter; Firstly, thanks for your detailed reply. I know that this is not a well-specified question, but I don't have any metrics

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Furkan KAMACI
Thanks for your answer. 2013/4/10 Walter Underwood wun...@wunderwood.org We are using Amazon EC2 M1 Extra Large instances (m1.xlarge). http://aws.amazon.com/ec2/instance-types/ wunder On Apr 9, 2013, at 3:35 PM, Furkan KAMACI wrote: Hi Walter; Firstly, thanks for your detailed reply.

Re: Pushing a whole set of pdf-files to solr

2013-04-09 Thread sdspieg
If anybody could still help me out with this, I'd really appreciate it. Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Pushing-a-whole-set-of-pdf-files-to-solr-tp4025256p4054885.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Pushing a whole set of pdf-files to solr

2013-04-09 Thread Furkan KAMACI
Apache Solr 4 Cookbook says that: curl "http://localhost:8983/solr/update/extract?literal.id=1&commit=true" -F "myfile=@cookbook.pdf" is that what you want? 2013/4/10 sdspieg sdsp...@mail.ru If anybody could still help me out with this, I'd really appreciate it. Thanks! -- View this message
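The extract URL carries two query parameters, `literal.id` and `commit`, which must be joined with `&`. A small sketch of building that URL programmatically, so the separator and escaping are handled by the standard library (the helper name and the base URL default are assumptions of this example; it only builds the string and does not upload anything):

```python
from urllib.parse import urlencode


def extract_url(doc_id, commit=True, base_url="http://localhost:8983/solr"):
    """Build the /update/extract URL for one document.

    `literal.id` supplies the value of the id field for the uploaded file;
    `commit=true` asks Solr to commit right after the upload.
    """
    params = {"literal.id": doc_id, "commit": "true" if commit else "false"}
    return f"{base_url}/update/extract?{urlencode(params)}"


if __name__ == "__main__":
    print(extract_url("1"))
```

For a whole folder of PDFs, the same helper could be called once per file with a distinct id, committing only on the last upload.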

Re: Pushing a whole set of pdf-files to solr

2013-04-09 Thread Jack Krupansky
The newer release of SimplePostTool with Solr 4.x makes it easy to post PDF files from a directory, including automatically adding the file name to a field. But SolrCell is the direct API that it uses as well. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Tuesday,

Re: Slow qTime for distributed search

2013-04-09 Thread Shawn Heisey
On 4/9/2013 3:50 PM, Furkan KAMACI wrote: Hi Shawn; You say that: *... your documents are about 50KB each. That would translate to an index that's at least 25GB* I know we cannot give an exact size, but what is the approximate ratio of document size to index size in your

Re: How can I set configuration options?

2013-04-09 Thread Chris Hostetter
: Thanks for the replies. The problem I have is that setting them at the JVM : level would mean that all instances of Solr deployed in the Tomcat instance : are forced to use the same settings. I actually want to set the properties : at the application level (e.g. in solr.xml, zoo.conf or maybe an

Re: Field exist in schema.xml but returns

2013-04-09 Thread deniz
Raymond Wiker wrote You have misspelt the tag name in the field definition... you have fiald instead of field. thank you Raymond, it was really hard to find it out in a massive schema file - Zeki ama calismiyor... Calissa yapar... -- View this message in context:

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Shawn Heisey
On 4/9/2013 4:06 PM, Furkan KAMACI wrote: Is there anybody who can help me estimate the approximate RAM needed for 5000 queries/second on a Solr machine? You've already gotten some good replies, and I'm aware that they haven't really answered your question. This is the kind of

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Furkan KAMACI
These are really good metrics for me: You say that RAM size should be at least the index size, and that it is better to have RAM twice the index size (because of the worst-case scenario). On the other hand, let's assume that I have more RAM than twice the size of the indexes on the machine. Can Solr
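The rule of thumb restated above can be written down as simple arithmetic. In this sketch the JVM heap is counted separately from the memory left to the OS for caching the index; that split, the function name, and the example figures are assumptions for illustration, not numbers from the thread.

```python
def ram_estimate_gb(index_gb, heap_gb, worst_case_factor=2.0):
    """Rough RAM sizing from index size and JVM heap.

    minimum:     heap plus enough free memory to cache the whole index once
    comfortable: heap plus `worst_case_factor` times the index size, leaving
                 room for the index to temporarily grow on disk
    """
    return {
        "minimum": heap_gb + index_gb,
        "comfortable": heap_gb + worst_case_factor * index_gb,
    }


if __name__ == "__main__":
    # Hypothetical machine: 5 GB index on disk, 2 GB JVM heap.
    print(ram_estimate_gb(index_gb=5, heap_gb=2))
```

This says nothing about queries per second; as the replies in the thread point out, throughput depends on the queries and the data and has to be measured.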

Re: Results Order When Performing Wildcard Query

2013-04-09 Thread Chris Hostetter
: My gut says the difference in assignment of docids has to do with how the : FileListEntityProcessor (http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor) docids just represent the order documents are added to the index. if you use DIH with FileListEntityProcessor to create

Re: Pushing a whole set of pdf-files to solr

2013-04-09 Thread sdspieg
Thanks for those replies. I will look into them. But if anyone knows of a site that describes step by step how a Windows user who has already installed Solr (and Tomcat) can easily feed a folder (and subfolders) with 100s of PDFs into Solr, or would be willing to write down those steps, I

Re: Pushing a whole set of pdf-files to solr

2013-04-09 Thread sdspieg
I am able to run the java -jar post.jar -help command which I found here: http://docs.lucidworks.com/display/solr/Running+Solr. But now how can I tell post to post all pdf files in a certain folder (preferably recursively) to a collection? Could anybody please post the exact command for that?

Re: Solr 4.2.1 SSLInitializationException

2013-04-09 Thread Sarita Nair
Hi Chris, Thanks for your response. My understanding is that GlassFish specifies the keystore as a system property, but does not specify the password  in order to protect it from snooping. There's a keychain that requires a password to be passed from the DAS in order to unlock the key for the

Re: Pushing a whole set of pdf-files to solr

2013-04-09 Thread Gora Mohanty
On 10 April 2013 07:28, sdspieg sdsp...@mail.ru wrote: I am able to run the java -jar post.jar -help command which I found here: http://docs.lucidworks.com/display/solr/Running+Solr. But now how can I tell post to post all pdf files in a certain folder (preferably recursively) to a collection?

Re: Pushing a whole set of pdf-files to solr

2013-04-09 Thread sdspieg
Another progress report. I 'flattened' all the folders which contained the pdf files with Fileboss and then moved the pdf files to the directory where I found the post.jar file (in solr-4.2.1\solr-4.2.1\example\exampledocs). I then ran java -Ddata=files -jar post.jar *.pdf and in the command

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Shawn Heisey
On 4/9/2013 7:03 PM, Furkan KAMACI wrote: These are really good metrics for me: You say that RAM size should be at least index size, and it is better to have a RAM size twice the index size (because of worst case scenario). On the other hand let's assume that I have a RAM size that is

Re: Pushing a whole set of pdf-files to solr

2013-04-09 Thread Gora Mohanty
On 10 April 2013 08:11, sdspieg sdsp...@mail.ru wrote: Another progress report. I 'flattened' all the folders which contained the pdf files with Fileboss and then moved the pdf files to the directory where I found the post.jar file (in solr-4.2.1\solr-4.2.1\example\exampledocs). I then ran

Re: Pushing a whole set of pdf-files to solr

2013-04-09 Thread Jack Krupansky
The newer SimplePostTool can in fact recurse a directory of PDFs. Just get the usage for the tool. I'm sure it lists the command options. -- Jack Krupansky -Original Message- From: sdspieg Sent: Tuesday, April 09, 2013 9:48 PM To: solr-user@lucene.apache.org Subject: Re: Pushing a

Re: edismax returns very less matches than regular

2013-04-09 Thread Erick Erickson
Adding debugQuery=true is your friend. I suspect that you'll find your first query is actually searching name:coldfusion OR defaultsearchfield:cache and you _think_ it's searching for both coldfusion and cache in the name field Best Erick On Mon, Apr 8, 2013 at 2:50 AM, amit

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Furkan KAMACI
I am sorry but you said: *you need enough free RAM for the OS to cache the maximum amount of disk space all your indexes will ever use* I have made an assumption about the indexes on my machine. Let's assume they take 5 GB. So is it better to have at least 5 GB RAM? OK, Solr will use RAM up to how

Re: Approximately needed RAM for 5000 query/second at a Solr machine?

2013-04-09 Thread Shawn Heisey
On 4/9/2013 9:12 PM, Furkan KAMACI wrote: I am sorry but you said: *you need enough free RAM for the OS to cache the maximum amount of disk space all your indexes will ever use* I have made an assumption about the indexes on my machine. Let's assume they take 5 GB. So is it better to have at

RE: Solr index Backup and restore of large indexs

2013-04-09 Thread Sandeep Kumar Anumalla
Please update? -Original Message- From: Sandeep Kumar Anumalla Sent: 31 March, 2013 12:08 PM To: solr-user@lucene.apache.org Cc: 'Joel Bernstein' Subject: RE: Solr index Backup and restore of large indexs Hi, I am exploring all the possible options. We want to distribute 1 TB traffic

Re: query regarding the use of boost across the fields in edismax query

2013-04-09 Thread Rohan Thakur
Hi Otis, can you explain that in some depth? For example, if I search for led in both cases, what could be the difference in the results I get? thanks in advance regards Rohan On Tue, Apr 9, 2013 at 11:25 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Not sure if i'm missing something but

RE: Solr 4.2 Incremental backups

2013-04-09 Thread Sandeep Kumar Anumalla
HI Erick, My main point is if I use replication I have to use similar kind of setup (Hardware, storage space) as such as the Master, it more cost effective, that is why I am looking at incremental backup options, so that I can keep these backup any place like external Hard disks, tapes. And