Re: indexing rich data from directory using solarium

2015-12-02 Thread Gora Mohanty
On 2 December 2015 at 21:55, kostali hassan wrote: > Yes, there is an error in my Solr logs: > SolrException URLDecoder: Invalid character encoding detected after > position 79 of query string / form data (while parsing as UTF-8) >

Re: indexing rich data from directory using solarium

2015-12-02 Thread Gora Mohanty
On 2 December 2015 at 21:59, Erik Hatcher wrote: > Gora - > > SimplePostTool actually already adds the literal.id parameter* when in “auto” > mode (and it’s not an XML, JSON, or CSV file). Ah, OK. It has been a while since I actually used the tool. Thanks for the info.

Re: Block Joins

2015-12-02 Thread Rick Leir
Hi Mikhail Sorry, I should have noted "that" is a word in the OCR text that I have indexed. What do I want to achieve? The Block Joins we have been discussing were giving me scores of 0.0, and I would like to get something a wee bit better than that (not looking for accuracy yet). In the query

migrate(or copy) data from one core1(node2) to anothere core2(node1)

2015-12-02 Thread Mugeesh Husain
Hello, I have two Solr instances: one runs in standalone (non-cloud) mode, the other in SolrCloud mode. The data is indexed in the standalone (non-cloud) instance, and I have now defined the same core with the same schema in the SolrCloud instance. I want to transfer/copy the data from the non-cloud core to my SolrCloud core. On

Re: highlight

2015-12-02 Thread Rick Leir
For performance, if you have many large documents, you want to index the whole document but only store some identifiers. (Maybe this is not a consideration for you, stop reading now ) If you are not storing the whole document, then Solr cannot do the highlighting. You would get an id, then

Method to fix issue when you get KeeperErrorCode = NoAuth when Zookeeper ACL enabled

2015-12-02 Thread Jeff Wu
We have been following this wiki to enable ZooKeeper ACL control: https://cwiki.apache.org/confluence/display/solr/ZooKeeper+Access+Control#ZooKeeperAccessControl-AboutZooKeeperACLs It works fine for the Solr service itself, but when you try to use scripts/cloud-scripts/zkcli.sh to put a zNode, it

Re: Block Joins

2015-12-02 Thread Mikhail Khludnev
Rick, would you mind posting the exact query params and the response, and letting us know what you expect? Otherwise, it's hard to understand the problem. On Wed, Dec 2, 2015 at 5:44 PM, Rick Leir wrote: > Hi Mikhail > Sorry, I should have noted "that" is a word in the OCR text that I have >

Re: highlight

2015-12-02 Thread Teague James
Hello, Thanks for replying! Yes, I am storing the whole document. The document is indexed with a unique id. There are only 3 fields in the schema - id, rawDocument, tikaDocument. Search uses the tikaDocument field. Against this I am throwing 2-5 word phrases and getting highlighting matches to

Re: indexing rich data from directory using solarium

2015-12-02 Thread Erik Hatcher
Gora - SimplePostTool actually already adds the literal.id parameter* when in “auto” mode (and it’s not an XML, JSON, or CSV file). Erik * See

SOLR-2798 (local params parsing issue) -- how can I help?

2015-12-02 Thread Demian Katz
Hello, I'd really love to see a resolution to SOLR-2798, since my application has a bug that cannot be addressed until this issue is fixed. It occurred to me that there's a good chance that the code involved in this issue is relatively isolated and testable, so I might be able to help with a

Re: indexing rich data from directory using solarium

2015-12-02 Thread kostali hassan
The problem with posting from the command line is this: I started working with Solr 5.3.1 by extracting Solr to D://solr and running the Solr server with: D:\solr\solr-5.3.1\bin>solr start ; Then I created a core in standalone mode: D:\solr\solr-5.3.1\bin>solr create -c mycore I need indexing from system files

Re: migrate(or copy) data from one core1(node2) to anothere core2(node1)

2015-12-02 Thread Erick Erickson
Just shut down the SolrCloud instance and copy the index from the non-cloud to cloud directory. Then bring the cloud instance up. You should then be fine. This assumes that your SolrCloud instance has only one shard, which is what I expect given your index size. After that's done, and assuming

Re: indexing rich data from directory using solarium

2015-12-02 Thread kostali hassan
Yes, there is an error in my Solr logs: SolrException URLDecoder: Invalid character encoding detected after position 79 of query string / form data (while parsing as UTF-8)

Re: indexing rich data from directory using solarium

2015-12-02 Thread Gora Mohanty
On 2 December 2015 at 17:16, kostali hassan wrote: > Yes, that's logical, thank you, but I want to understand why the same data > indexes fine from the shell using the Windows SimplePostTool: >> >> D:\solr\solr-5.3.1>java -classpath example\exampledocs\post.jar -Dauto=yes >>

Re: migrate(or copy) data from one core1(node2) to anothere core2(node1)

2015-12-02 Thread Mugeesh Husain
Thanks Erick, I am performing a join operation across multiple cores in SolrCloud mode. >>After that's done, and assuming you want to add replicas in the SolrCloud version for HA/DR/Performance reasons, use the ADDREPLICA Collections API command. If I split the core into shards, is there any way to use

Re: indexing rich data from directory using solarium

2015-12-02 Thread kostali hassan
I fixed it, but there is still a small problem with the 30-second timeout of the WAMP server, so I can only put 130 files at a time into a directory to index until I have indexed all my files. This is my function to index documents: *App::import('Vendor','autoload',array('file'=>'solarium/vendor/autoload.php'));* *public function

Re: indexing rich data from directory using solarium

2015-12-02 Thread Gora Mohanty
On 2 December 2015 at 22:35, kostali hassan wrote: > I fixed it, but there is still a small problem with the 30-second timeout of the WAMP server, so I > can only put 130 files at a time into a directory to index until I have indexed all my files. > This is my function to index documents: Again, not familiar with

Solr extract performance

2015-12-02 Thread kostali hassan
I am looking for the optimal way to extract and commit rich data from a directory containing many MS Word and PDF files, because I have a problem with the 30-second timeout of the WAMP server. This is my function to index documents in CakePHP using solarium:

Searching and sorting using field aliasing

2015-12-02 Thread Mahmoud Almokadem
Hi all, I have two cores (core1, core2). core1 contains fields (f1, f2, f3, date1) and core2 contains fields (f2, f3, f4, date2). I want to search on the two cores with the date field. Is there an alias to query the two fields in a distributed search? For example when q=dateField:NOW perform

Re: migrate(or copy) data from one core1(node2) to anothere core2(node1)

2015-12-02 Thread Erick Erickson
How are you splitting your core to shards? And why? You should only shard when you cannot get reasonable performance on a single shard. To increase the queries-per-second, simply add replicas. And if at all possible, it would be much less error-prone to just re-index your data into a collection
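For readers following the replica advice in this thread, here is a minimal sketch of how an ADDREPLICA request to the Collections API could be assembled. The host, collection name, and shard name are hypothetical placeholders, and the sketch only builds the URL; it does not send it.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard):
    """Build a Collections API ADDREPLICA URL (the request is not sent here)."""
    params = {
        "action": "ADDREPLICA",
        "collection": collection,  # hypothetical collection name
        "shard": shard,            # hypothetical shard name
        "wt": "json",
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder values, not taken from the thread.
url = add_replica_url("http://localhost:8983/solr", "mycollection", "shard1")
print(url)
```

The resulting URL can be opened with any HTTP client once the real host and collection names are substituted.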

Re: indexing rich data from directory using solarium

2015-12-02 Thread kostali hassan
Yes, I am sure, because I successfully posted the same documents (455 .doc, .docx and PDF files in 18 seconds) with SimplePostTool. But now I want to communicate directly with my Solr server using solarium in my CakePHP application; I think the only way to get the right encoding is in the header: *$headers =

Re: Error on DIH log

2015-12-02 Thread Gora Mohanty
On 27 November 2015 at 11:12, Midas A wrote: > Error: > org.apache.solr.common.SolrException: ERROR: [doc=83629504] Error adding > field 'master_id'='java.math.BigInteger:0' msg=For input string: > "java.math.BigInteger:0" Sorry, was busy the last few days. On a closer

Problems integrating Uima with solr

2015-12-02 Thread vaibhavlella
I followed these steps but was getting warnings Step 1:Setting tags in solrconfig.xml appropriately to point those jar files Step 2:modified solrconfig.xml adding the following snippet and working API keys: VALID_ALCHEMYAPI_KEY

Failed to create collection in Solrcloud

2015-12-02 Thread Mugeesh Husain
Hi, I am using 3 servers: solr1, solr2 and solr3. I have set up 3 instances of ZooKeeper on server solr2. When I try to create 1 shard and 2 replicas, it works fine. But when I try to create a core with 1 shard and 3 replicas, using this command bin/solr create -c abc -n abcr -shards 1

Re: Solr Auto-Complete

2015-12-02 Thread Salman Ansari
Sounds good but I heard "/suggest" component is the recommended way of doing auto-complete in the new versions of Solr. Something along the lines of this article https://cwiki.apache.org/confluence/display/solr/Suggester mySuggester FuzzyLookupFactory DocumentDictionaryFactory

Re: Different Similarities for the same field

2015-12-02 Thread Scott Stults
I haven't tried this before (overriding default similarity in a custom SearchComponent), but it looks like it should be possible. In QueryComponent.process() you can get a hold of the SolrIndexSearcher and call setSimilarity(). It also looks like this is set only once by default when the searcher

Grouping by simhash signature

2015-12-02 Thread Nickolay41189
I try to implement NearDup detection by SimHash algorithm in Solr. Let's say: 1) each document has a field /simhash_signature/ that stores a sequence of bits. 2) that in order to be considered NearDup, documents must have, at most, 2 bits
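As a rough illustration of the near-duplicate criterion described in this message, the sketch below compares two signatures by Hamming distance. It assumes the truncated sentence refers to at most 2 differing bits and that signatures are held as Python integers; the function names are hypothetical and this is not Solr code.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def is_near_dup(a: int, b: int, max_bits: int = 2) -> bool:
    """Assumed criterion: signatures differing in at most max_bits bits."""
    return hamming_distance(a, b) <= max_bits


# Two made-up 4-bit signatures that differ in exactly two positions.
print(hamming_distance(0b1011, 0b1000))
print(is_near_dup(0b1011, 0b1000))
```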

Is it possible to sort on a BooleanField?

2015-12-02 Thread Clemens Wyss DEV
Looks like not. I get to see 'can not sort on a field which is neither indexed nor has doc values: ' - Clemens

Re: Failed to create collection in Solrcloud

2015-12-02 Thread Zheng Lin Edwin Yeo
Hi Mugesh, Which version of Solr and ZooKeeper are you using? i did start this server using this command bin/solr start -cloud -p 8985 -s "example/cloud/node1/solr" -z solr2:2181,solr2:2182,solr3:2183 > Shouldn't the command be "*bin/solr start -cloud -p 8985 -s "example/cloud/node1/solr" -z

Re: Solr Auto-Complete

2015-12-02 Thread Andrea Gazzarini
Hi Salman, a few months ago I was involved in a project similar to map.geoadmin.ch, and there I had the same need as you (I also sent an email to this list). From my side I can further confirm what Alan and Alessandro already explained; I followed that approach. IMHO, that is the "recommended

Re: Is it possible to sort on a BooleanField?

2015-12-02 Thread Muhammad Zahid Iqbal
Please share your schema. On Thu, Dec 3, 2015 at 11:28 AM, Clemens Wyss DEV wrote: > Looks like not. I get to see > 'can not sort on a field which is neither indexed nor has doc values: > ' > > - Clemens >

Re: Protect against duplicates with the Migrate statement

2015-12-02 Thread philippa griggs
I used two fields to set up the signature: the unique ID and a timestamp field. As it's in test, I set it up, cleared all the data out in both collections and reloaded it. I could see the signature which was created. I then migrated into the cold collection, which already had documents in with the

AW: Is it possible to sort on a BooleanField?

2015-12-02 Thread Clemens Wyss DEV
... ... ... Guess then I must set indexed="true" ;) Is it true the BooleanField may not have docValues? -Original Message- From: Muhammad Zahid Iqbal [mailto:zahid.iq...@northbaysolutions.net] Sent: Thursday, 3 December 2015 08:01 To: solr-user Subject:

Solr Auto-Complete

2015-12-02 Thread Salman Ansari
Hi, I am looking for auto-complete in Solr but on top of just auto complete I want as well to return the data completely (not just suggestions), so I want to get back the ids, and other fields in the whole document. I tried the following 2 approaches but each had issues 1) Used the /suggest

indexing rich data from directory using solarium

2015-12-02 Thread kostali hassan
How can I index rich data (MS Word and PDF files) with solarium from a directory that contains many files? My config is $config = array( "endpoint" => array("localhost" => array("host"=>"127.0.0.1", "port"=>"8983", "path"=>"/solr", "core"=>"demo",) ) ); I tried this code:

Re: indexing rich data from directory using solarium

2015-12-02 Thread Gora Mohanty
On 2 December 2015 at 16:32, kostali hassan wrote: [...] > > When i execute it i get this ERROR: > > org.apache.solr.common.SolrException: URLDecoder: Invalid character > encoding detected after position 79 of query string / form data (while > parsing as UTF-8) Solr
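The URLDecoder error quoted in this message is the kind Solr reports when percent-encoded bytes in a request do not form valid UTF-8. The following is a minimal sketch, not the solarium code from the thread, showing how the same character yields different percent-encodings; the file name is a hypothetical example.

```python
from urllib.parse import quote, unquote

# Hypothetical file name containing a non-ASCII character (e with acute accent).
name = "r\u00e9sum\u00e9.doc"

utf8_encoded = quote(name, encoding="utf-8")      # two bytes per accented letter
latin1_encoded = quote(name, encoding="latin-1")  # one byte, not valid UTF-8


def decodes_as_utf8(value: str) -> bool:
    """Return True if the percent-encoded value decodes cleanly as UTF-8."""
    try:
        unquote(value, encoding="utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_encoded, decodes_as_utf8(utf8_encoded))
print(latin1_encoded, decodes_as_utf8(latin1_encoded))
```

A client that percent-encodes file names or literal values in a non-UTF-8 charset would produce the second form, which a server parsing the query string as UTF-8 cannot decode.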

Re: Solr Auto-Complete

2015-12-02 Thread Alan Woodward
Hi Salman, It sounds as though you want to do a normal search against a special 'suggest' field, that's been indexed with edge ngrams. Alan Woodward www.flax.co.uk On 2 Dec 2015, at 09:31, Salman Ansari wrote: > Hi, > > I am looking for auto-complete in Solr but on top of just auto complete
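To make the edge n-gram idea in this reply concrete, here is a small sketch of the prefixes such an analysis chain would index for a term. It is plain Python rather than Solr's analyzer, and the `min_gram` and `max_gram` parameter names are illustrative assumptions.

```python
def edge_ngrams(term, min_gram=1, max_gram=10):
    """Return the leading-edge prefixes of a lowercased term."""
    term = term.lower()
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]


# Each prefix becomes an indexed token, so a partially typed query such as
# "so" matches the document through an ordinary search on that field.
print(edge_ngrams("Solr"))
```

Because the match happens through a normal query, the response can carry the ids and stored fields of the whole document rather than only suggestion strings.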

Re: Solr Auto-Complete

2015-12-02 Thread Alessandro Benedetti
Hi Salman, I agree with Alan. Just configure your schema with the proper analysers. For the field you want to use for suggestions, you are likely to need simply this fieldType : This is a

Protect against duplicates with the Migrate statement

2015-12-02 Thread philippa griggs
Hello, I'm using Solr 5.2.1 and Zookeeper 3.4.6. I'm implementing two collections - HotDocuments and ColdDocuments . New documents will only be written to HotDocuments and every night I will migrate a chunk of documents into ColdDocuments. In the test environment, I have the Collection API

UpdateLogs in HDFS

2015-12-02 Thread Alan Woodward
Hi all, As a step in SOLR-8282, I'm trying to get all access to the data directory done by Solr to be mediated through the DirectoryFactory implementation. Part of this is the creation of the UpdateLog, and I'm a bit confused by some of the logic in there currently. The UpdateLog is created

Re: Create Collection Admin Request - unable to specify collection configName

2015-12-02 Thread Kelly, Frank
Thank you everyone - this was EXACTLY my problem. I was using a chroot for startup but not on the upload of configurations. Now everything works as expected. Thanks everyone! -Frank On 12/2/15, 12:10 AM, "Upayavira" wrote: >Adding /solr to the zk string 'namespaces' the

Re: indexing rich data from directory using solarium

2015-12-02 Thread kostali hassan
Yes, that's logical, thank you, but I want to understand why the same data indexes fine from the shell using the Windows SimplePostTool: > > D:\solr\solr-5.3.1>java -classpath example\exampledocs\post.jar -Dauto=yes > -Dc=solr_docs_core -Ddata=files -Drecursive=yes > org.apache.solr.util.SimplePostTool

Re: Protect against duplicates with the Migrate statement

2015-12-02 Thread Zheng Lin Edwin Yeo
Hi Philippa, Which field did you use to set it as SignatureField in your ColdDocuments when you implement the de-duplication? Regards, Edwin On 2 December 2015 at 18:59, philippa griggs wrote: > Hello, > > > I'm using Solr 5.2.1 and Zookeeper 3.4.6. > > > I'm