High CPU Usage in export handler

2016-11-02 Thread Ray Niu
Hello:
   We are using the export handler in SolrCloud to retrieve some data,
requesting only one field, whose type is tdouble. It worked well at the
beginning, but recently we have seen high CPU usage on all the SolrCloud
nodes. We took some thread dumps and found the following information:

   java.lang.Thread.State: RUNNABLE

at java.lang.Thread.isAlive(Native Method)
at org.apache.lucene.util.CloseableThreadLocal.purge(CloseableThreadLocal.java:115)
- locked <0x0006e24d86a8> (a java.util.WeakHashMap)
at org.apache.lucene.util.CloseableThreadLocal.maybePurge(CloseableThreadLocal.java:105)
at org.apache.lucene.util.CloseableThreadLocal.get(CloseableThreadLocal.java:88)
at org.apache.lucene.index.CodecReader.getNumericDocValues(CodecReader.java:143)
at org.apache.lucene.index.FilterLeafReader.getNumericDocValues(FilterLeafReader.java:430)
at org.apache.lucene.uninverting.UninvertingReader.getNumericDocValues(UninvertingReader.java:239)
at org.apache.lucene.index.FilterLeafReader.getNumericDocValues(FilterLeafReader.java:430)

Is this a known issue with the export handler? Since we only fetch up to 5000
documents, it should not be a data-volume issue.

Can anyone help on that? Thanks a lot.
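
For reference, a request of the kind described above looks roughly like the
sketch below; the host, collection, and field names are placeholders, and the
export handler requires both fl and sort over docValues fields:

  curl "http://localhost:8983/solr/mycollection/export?q=*:*&fl=price_td&sort=price_td+asc"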


Re: Timeout occured while waiting response from server at: http://***/solr/commodityReview

2016-11-02 Thread Erick Erickson
It really sounds like you're re-inventing SolrCloud, but
you know your requirements best.

Erick

On Wed, Nov 2, 2016 at 8:48 PM, Kent Mu  wrote:
> Thanks Erick!
> Actually, similar to SolrCloud, we split our data into 8 customized shards
> (1 master with 4 slaves), each fronted by one Citrix load balancer and two
> Apache web servers to reduce server pressure through load balancing.
> As we are running an e-commerce site, the number of reviews of products on
> sale grows very fast. We take the modulus of the product code to put the
> reviews in the proper customized Solr shard, so that we can keep the index
> size on each Solr relatively small.
> We will first try to upgrade the physical memory and see what happens. If
> the query performance is not ideal, we will try to deploy Solr on a
> physical machine, or we can use SSDs instead.
>
> “Rome was not built in a day”, so we can explore it step by step.
> Ha ha...
> Best Regards!
> Kent
>
> 2016-11-03 1:10 GMT+08:00 Erick Erickson :
>
>> You need to move to SolrCloud when it's
>> time to shard ;).
>>
>> More seriously, at some point simply adding more
>> memory will not be adequate. Either your JVM
>> heap will grow to a point where you start encountering
>> GC pauses, or the time to serve requests will
>> increase unacceptably. "When?" you ask? Well,
>> unfortunately there are no guidelines that can be
>> guaranteed; here's a long blog on the subject:
>>
>> https://lucidworks.com/blog/sizing-hardware-in-the-
>> abstract-why-we-dont-have-a-definitive-answer/
>>
>> The short form is you need to stress-test your
>> index and query patterns.
>>
>> Now, I've seen 20M docs strain a 32G Java heap. I've
>> seen 300M docs give very nice response times with
>> 12G of memory. It Depends (tm).
>>
>> Whether to put Solr on bare metal or not: There's
>> inevitably some penalty for a VM. That said there are lots
>> of places that use VMs successfully. Again, stress
>> testing is the key.
>>
>> And finally, using docValues for any field that sorts,
>> facets or groups will reduce the JVM requirements
>> significantly, albeit by using OS memory space, see
>> Uwe's excellent blog:
>>
>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>>
>> Best,
>> Erick
>>
>> On Tue, Nov 1, 2016 at 10:23 PM, Kent Mu  wrote:
>> > Thanks, I got it, Erick!
>> >
>> > Our index data grows by more than 30GB every year now and is still
>> > growing, and our Solr currently runs on a virtual machine. So I wonder
>> > whether we need to deploy Solr on a physical machine, or whether I can
>> > just upgrade the physical memory of our virtual machines?
>> >
>> > Best,
>> > Kent
>> >
>> > 2016-11-02 11:33 GMT+08:00 Erick Erickson :
>> >
>> >> Kent: OK, I see now. Then a minor pedantic point...
>> >>
>> >> It'll avoid confusion if you use master and slaves
>> >> rather than master and replicas when talking about
>> >> non-cloud setups.
>> >>
>> >> The equivalent in SolrCloud is leader and replicas.
>> >>
>> >> No big deal either way, just FYI.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Tue, Nov 1, 2016 at 8:09 PM, Kent Mu  wrote:
>> >> > Thanks a lot for your reply, Shawn!
>> >> >
>> >> > There are no other applications on the server. I agree with you that
>> >> > we need to upgrade the physical memory and allocate a reasonable JVM
>> >> > size, so that the operating system has spare memory available to cache
>> >> > the index.
>> >> >
>> >> > Actually, we add nearly 100 million documents every year now, and it
>> >> > is still growing, and our Solr currently runs on a virtual machine, so
>> >> > I wonder if we need to deploy Solr on a physical machine.
>> >> >
>> >> > Best Regards!
>> >> > Kent
>> >> >
>> >> > 2016-11-01 21:18 GMT+08:00 Shawn Heisey :
>> >> >
>> >> >> On 11/1/2016 1:07 AM, Kent Mu wrote:
>> >> >> > Hi friends! We come across an issue when we use the solrj(4.9.1) to
>> >> >> > connect to solr server, our deployment is one master with 10
>> replicas.
>> >> >> > we index data to the master, and search data from the replicas via
>> >> >> > load balancing. the error stack is as below: *Timeout occured while
>> >> >> > waiting response from server at:
>> >> >> > http://review.solrsearch3.cnsuning.com/solr/commodityReview
>> >> >> > *
>> >> >> > org.apache.solr.client.solrj.SolrServerException: Timeout occured
>> >> >> > while waiting response from server at:
>> >> >>
>> >> >> This shows that you are connecting to port 80.  It is relatively
>> rare to
>> >> >> run Solr on port 80, though it is possible.  Do you have an
>> intermediate
>> >> >> layer, like a proxy or a load balancer?  If so, you'll need to ensure
>> >> >> that there's not a problem there.  If it works normally when
>> replication
>> >> >> isn't happening, that's probably not a worry.
>> >> >>
>> >> 

Re: Timeout occured while waiting response from server at: http://***/solr/commodityReview

2016-11-02 Thread Kent Mu
Thanks Erick!
Actually, similar to SolrCloud, we split our data into 8 customized shards
(1 master with 4 slaves), each fronted by one Citrix load balancer and two
Apache web servers to reduce server pressure through load balancing.
As we are running an e-commerce site, the number of reviews of products on
sale grows very fast. We take the modulus of the product code to put the
reviews in the proper customized Solr shard, so that we can keep the index
size on each Solr relatively small.
We will first try to upgrade the physical memory and see what happens. If the
query performance is not ideal, we will try to deploy Solr on a physical
machine, or we can use SSDs instead.

“Rome was not built in a day”, so we can explore it step by step.
Ha ha...
Best Regards!
Kent

2016-11-03 1:10 GMT+08:00 Erick Erickson :

> You need to move to SolrCloud when it's
> time to shard ;).
>
> More seriously, at some point simply adding more
> memory will not be adequate. Either your JVM
> heap will grow to a point where you start encountering
> GC pauses, or the time to serve requests will
> increase unacceptably. "When?" you ask? Well,
> unfortunately there are no guidelines that can be
> guaranteed; here's a long blog on the subject:
>
> https://lucidworks.com/blog/sizing-hardware-in-the-
> abstract-why-we-dont-have-a-definitive-answer/
>
> The short form is you need to stress-test your
> index and query patterns.
>
> Now, I've seen 20M docs strain a 32G Java heap. I've
> seen 300M docs give very nice response times with
> 12G of memory. It Depends (tm).
>
> Whether to put Solr on bare metal or not: There's
> inevitably some penalty for a VM. That said there are lots
> of places that use VMs successfully. Again, stress
> testing is the key.
>
> And finally, using docValues for any field that sorts,
> facets or groups will reduce the JVM requirements
> significantly, albeit by using OS memory space, see
> Uwe's excellent blog:
>
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> Best,
> Erick
>
> On Tue, Nov 1, 2016 at 10:23 PM, Kent Mu  wrote:
> > Thanks, I got it, Erick!
> >
> > Our index data grows by more than 30GB every year now and is still
> > growing, and our Solr currently runs on a virtual machine. So I wonder
> > whether we need to deploy Solr on a physical machine, or whether I can
> > just upgrade the physical memory of our virtual machines?
> >
> > Best,
> > Kent
> >
> > 2016-11-02 11:33 GMT+08:00 Erick Erickson :
> >
> >> Kent: OK, I see now. Then a minor pedantic point...
> >>
> >> It'll avoid confusion if you use master and slaves
> >> rather than master and replicas when talking about
> >> non-cloud setups.
> >>
> >> The equivalent in SolrCloud is leader and replicas.
> >>
> >> No big deal either way, just FYI.
> >>
> >> Best,
> >> Erick
> >>
> >> On Tue, Nov 1, 2016 at 8:09 PM, Kent Mu  wrote:
> >> > Thanks a lot for your reply, Shawn!
> >> >
> >> > There are no other applications on the server. I agree with you that we
> >> > need to upgrade the physical memory and allocate a reasonable JVM size,
> >> > so that the operating system has spare memory available to cache the
> >> > index.
> >> >
> >> > Actually, we add nearly 100 million documents every year now, and it is
> >> > still growing, and our Solr currently runs on a virtual machine, so I
> >> > wonder if we need to deploy Solr on a physical machine.
> >> >
> >> > Best Regards!
> >> > Kent
> >> >
> >> > 2016-11-01 21:18 GMT+08:00 Shawn Heisey :
> >> >
> >> >> On 11/1/2016 1:07 AM, Kent Mu wrote:
> >> >> > Hi friends! We come across an issue when we use the solrj(4.9.1) to
> >> >> > connect to solr server, our deployment is one master with 10
> replicas.
> >> >> > we index data to the master, and search data from the replicas via
> >> >> > load balancing. the error stack is as below: *Timeout occured while
> >> >> > waiting response from server at:
> >> >> > http://review.solrsearch3.cnsuning.com/solr/commodityReview
> >> >> > *
> >> >> > org.apache.solr.client.solrj.SolrServerException: Timeout occured
> >> >> > while waiting response from server at:
> >> >>
> >> >> This shows that you are connecting to port 80.  It is relatively
> rare to
> >> >> run Solr on port 80, though it is possible.  Do you have an
> intermediate
> >> >> layer, like a proxy or a load balancer?  If so, you'll need to ensure
> >> >> that there's not a problem there.  If it works normally when
> replication
> >> >> isn't happening, that's probably not a worry.
> >> >>
> >> >> > It does not happen often. After analysis, we find it occurs only
> >> >> > when the replicas synchronize data from the master Solr server. It
> >> >> > seems that the replicas block search requests while synchronizing
> >> >> > data from the master; is that true?
> >> >>
> >> >> Solr should be able to continue serving requests while replication
> >> >> happens.

Re: Sorting Problem with custom ValueSourceParser

2016-11-02 Thread Tirthankar
@Rohit you can look into this.
http://www.javaworld.com/article/2074996/hashcode-and-equals-method-in-java-object---a-pragmatic-concept.html

A good article on hashCode and equals.






Re: Poor Solr Cloud Query Performance against a Small Dataset

2016-11-02 Thread Rick Leir
Here is a wild guess. Whenever I see a 5-second delay in networking, I
think DNS timeouts. YMMV, good luck.


cheers -- Rick
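
A quick way to test this guess from each Solr node is to time name resolution
directly; the hostnames below are placeholders:

  time getent hosts solr-node2.example.com
  time getent hosts zookeeper1.example.com

If either lookup takes seconds instead of milliseconds, check
/etc/resolv.conf and /etc/hosts on the node.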

On 2016-11-01 04:18 PM, Dave Seltzer wrote:

Hello!

I'm trying to utilize Solr Cloud to help with a hash search problem. The
record set has only 4,300 documents.

When I run my search against a single core, I get results on the order of
10 ms. When I run the same search against Solr Cloud, results take about
5,000 ms.

Is there something about this particular query which makes it perform
poorly in a Cloud environment? The query looks like this (linebreaks added
for readability):

{!frange l=5 u=25}sum(
 termfreq(hashTable_0,'225706351'),
 termfreq(hashTable_1,'17664000'),
 termfreq(hashTable_2,'86447642'),
 termfreq(hashTable_3,'134816033'),



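One way to narrow this down is to send the same function query to a single
replica with distributed search disabled and compare timings. A sketch, with
host, collection, and a shortened hash list as placeholders:

  curl "http://localhost:8983/solr/mycollection/select" \
    --data-urlencode "q={!frange l=5 u=25}sum(termfreq(hashTable_0,'225706351'),termfreq(hashTable_1,'17664000'))" \
    --data-urlencode "distrib=false" \
    --data-urlencode "rows=10"

If the single-replica timing is fast, the cost is in the distributed
fan-out and merge rather than in the query itself.
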

Re: Unit Tests failing

2016-11-02 Thread Shawn Heisey
On 11/2/2016 4:00 PM, Rishabh Patel wrote:
> Hello, I downloaded a tar tagged 6.2.1 from GitHub. When I run
> "ant test" from the lucene folder, all tests pass; however, tests repeatedly
> fail from the Solr folder. The tests are running on an Ubuntu 14.04 box with
> Java 8. These are the truncated statements from the output: 

> [junit4] FAILURE 0.08s J3 | TestCoreDiscovery.testSolrHomeNotReadable
> <<< [junit4] > Throwable #1: java.lang.AssertionError: Should have
> thrown an exception here

> Any inputs on how to fix this?

Are you running the tests as root?  That particular test will probably
always fail if you run as root, because root is generally able to
read/write to anything, even if the test takes steps to limit
permissions.  The assertion message probably should mention this.  Run
tests as a regular user.
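
For example, something like this (the user name and checkout path are
illustrative):

  useradd -m builder
  cp -r /root/solr /home/builder/solr && chown -R builder /home/builder/solr
  su - builder -c 'cd solr/solr && ant test -Dtestcase=TestCoreDiscovery'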

Note to self and other devs: the test class uses File extensively.  Look
into whether or not it can be updated to NIO2.

Thanks,
Shawn



Re: Problem with Password Decryption in Data Import Handler

2016-11-02 Thread Fuad Efendi
Then I can only guess that in the current configuration the decrypted password
is an empty string.

Try manually replacing some characters in the encpwd.txt file to see whether
you get different errors; try deleting the file completely to see whether you
get different errors. Try adding a newline to the file; try changing the
password in the config file.
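
Two quick local checks may help here (the base64 string below is the one
posted earlier in this thread). Note also that the posted script defines
plain_db_pw but writes "${plain_db_pwd}" (with an extra "d") into
plaindbpwd.txt, which would encrypt an empty string and fit the
empty-password guess exactly:

  wc -c encpwd.txt            # byte count: a trailing newline silently changes the key
  od -c encpwd.txt | tail -2  # show the raw bytes, including any \n
  # round-trip check: should print the database password if key and ciphertext agree
  echo 'U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o=' | \
    openssl enc -aes-128-cbc -a -d -salt -k "$(cat encpwd.txt)"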



On November 2, 2016 at 5:23:33 PM, Jamie Jackson (jamieja...@gmail.com) wrote:

I should have mentioned that I verified connectivity with plain passwords:  

From the same machine that Solr's running on:  

solr@000650cbdd5e:/opt/solr$ mysql -uroot -pOakton153 -h local.mysite.com \
  mysite -e "select 'foo' as bar;"
+-----+
| bar |
+-----+
| foo |
+-----+

Also, if I add the plain-text password to the config, it connects fine:  

<dataSource driver="org.mariadb.jdbc.Driver"
    url="jdbc:mysql://local.mysite.com:3306/mysite"
    user="root" password="Oakton153" />


So that is why I claim to have a problem with encryptKeyFile, specifically,  
because I've eliminated general connectivity/authentication problems.  

Thanks,  
Jamie  

On Wed, Nov 2, 2016 at 4:58 PM, Fuad Efendi  wrote:  

> In MySQL, this command will explicitly allow connections from the remote
> host ICZ2002912; check the MySQL documentation:
>
> GRANT ALL ON mysite.* TO 'root'@'ICZ2002912' IDENTIFIED BY 'Oakton153';
>  
>  
>  
> On November 2, 2016 at 4:41:48 PM, Fuad Efendi (f...@efendi.ca) wrote:  
>  
> This is the root of the problem:  
> "Access denied for user 'root'@'ICZ2002912' (using password: NO) “  
>  
>  
> First of all, ensure that plain (non-encrypted) password settings work for  
> you.  
>  
> Check that you can connect using MySQL client from ICZ2002912 to your  
> MySQL & Co. instance  
>  
> I suspect you need to allow MySQL & Co. to accept connections  
> from ICZ2002912. Plus, check DNS resolution, etc.  
>  
>  
> Thanks,  
>  
>  
> --  
> Fuad Efendi  
> (416) 993-2060  
> http://www.tokenizer.ca  
> Recommender Systems  
>  
>  
> On November 2, 2016 at 2:37:08 PM, Jamie Jackson (jamieja...@gmail.com)  
> wrote:  
>  
> I'm at a brick wall. Here's the latest status:  
>  
> Here are some sample commands that I'm using:  
>  
> *Create the encryptKeyFile and encrypted password:*  
>  
>  
> encrypter_password='this_is_my_encrypter_password'  
> plain_db_pw='Oakton153'  
>  
> cd /var/docker/solr_stage2/credentials/  
> echo -n "${encrypter_password}" > encpwd.txt  
> echo -n "${plain_db_pwd}" > plaindbpwd.txt  
> openssl enc -aes-128-cbc -a -salt -in plaindbpwd.txt -k  
> "${encrypter_password}"  
>  
> rm plaindbpwd.txt  
>  
> That generated this as the password, by the way:  
>  
> U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o=  
>  
> *Configure DIH configuration:*
>
> <dataSource driver="org.mariadb.jdbc.Driver"
> url="jdbc:mysql://local.mysite.com:3306/mysite"
> user="root"
> password="U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o="
> encryptKeyFile="/opt/solr/credentials/encpwd.txt"
> />
> />  
> ...  
>  
>  
> By the way, /var/docker/solr_stage2/credentials/ is mapped to  
> /opt/solr/credentials/ in the docker container, so that's why the paths  
> *seem* different (but aren't, really).  
>  
>  
> *Authentication error when data import is run:*  
>  
> Exception while processing: question document :  
> SolrInputDocument(fields:  
> []):org.apache.solr.handler.dataimport.DataImportHandlerException:  
> Unable to execute query: select 'foo' as bar; Processing  
> Document # 1  
> at org.apache.solr.handler.dataimport.DataImportHandlerException.  
> wrapAndThrow(DataImportHandlerException.java:69)  
> at org.apache.solr.handler.dataimport.JdbcDataSource$  
> ResultSetIterator.<init>(JdbcDataSource.java:323)
> at org.apache.solr.handler.dataimport.JdbcDataSource.  
> getData(JdbcDataSource.java:283)  
> at org.apache.solr.handler.dataimport.JdbcDataSource.  
> getData(JdbcDataSource.java:52)  
> at org.apache.solr.handler.dataimport.SqlEntityProcessor.  
> initQuery(SqlEntityProcessor.java:59)  
> at org.apache.solr.handler.dataimport.SqlEntityProcessor.  
> nextRow(SqlEntityProcessor.java:73)  
> at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(  
> EntityProcessorWrapper.java:244)  
> at org.apache.solr.handler.dataimport.DocBuilder.  
> buildDocument(DocBuilder.java:475)  
> at org.apache.solr.handler.dataimport.DocBuilder.  
> buildDocument(DocBuilder.java:414)  
> at org.apache.solr.handler.dataimport.DocBuilder.  
> doFullDump(DocBuilder.java:329)  
> at org.apache.solr.handler.dataimport.DocBuilder.execute(  
> DocBuilder.java:232)  
> at org.apache.solr.handler.dataimport.DataImporter.  
> doFullImport(DataImporter.java:416)  
> at org.apache.solr.handler.dataimport.DataImporter.  
> runCmd(DataImporter.java:480)  
> at org.apache.solr.handler.dataimport.DataImporter$1.run(  
> DataImporter.java:461)  
> Caused by: java.sql.SQLInvalidAuthorizationSpecException: Could not  
> connect: Access denied for user 'root'@'ICZ2002912' (using password:  
> NO)  
> at org.mariadb.jdbc.internal.util.ExceptionMapper.get(  
> ExceptionMapper.java:123)  
> at 

Unit Tests failing

2016-11-02 Thread Rishabh Patel
Hello,

I downloaded a tar tagged 6.2.1 from GitHub. When I run "ant test" from the
lucene folder, all tests pass; however, tests repeatedly fail from the Solr
folder. The tests are running on an Ubuntu 14.04 box with Java 8.

These are the truncated statements from the output:

   [junit4]   2> 685308 INFO
 (TEST-TestCoreDiscovery.testSolrHomeNotReadable-seed#[3D48BA4AD5FF1183]) [
   x:core2] o.a.s.SolrTestCaseJ4 ###Ending testSolrHomeNotReadable
   [junit4]   2> NOTE: reproduce with: ant test
 -Dtestcase=TestCoreDiscovery -Dtests.method=testSolrHomeNotReadable
-Dtests.seed=3D48BA4AD5FF1183 -Dtests.slow=true -Dtests.locale=lv
-Dtests.timezone=America/Cayenne -Dtests.asserts=true
-Dtests.file.encoding=ISO-8859-1
   [junit4] FAILURE 0.08s J3 | TestCoreDiscovery.testSolrHomeNotReadable <<<
   [junit4]> Throwable #1: java.lang.AssertionError: Should have thrown
an exception here
   [junit4]> at
__randomizedtesting.SeedInfo.seed([3D48BA4AD5FF1183:712880B16D407DD1]:0)
   [junit4]> at
org.apache.solr.core.TestCoreDiscovery.testSolrHomeNotReadable(TestCoreDiscovery.java:428)
   [junit4]> at java.lang.Thread.run(Thread.java:745)

 [junit4]   2> NOTE: All tests run in this JVM:
[SimpleCollectionCreateDeleteTest, ClusterStateUpdateTest, RankQueryTest,
DeleteNodeTest, TermVectorComponentTest, SuggesterTSTTest,
ZkControllerTest, FastVectorHighlighterTest, TestTestInjection,
UniqFieldsUpdateProcessorFactoryTest, DirectSolrSpellCheckerTest,
ZkStateWriterTest, TestMissingGroups, TestLRUCache,
TestNonDefinedSimilarityFactory, RulesTest, TestBinaryField,
SolrIndexConfigTest, LeaderInitiatedRecoveryOnCommitTest,
LukeRequestHandlerTest, OpenExchangeRatesOrgProviderTest,
HdfsNNFailoverTest, TestSolrConfigHandlerCloud, HighlighterConfigTest,
TestDocBasedVersionConstraints, XsltUpdateRequestHandlerTest,
TestSuggestSpellingConverter, TestDefaultStatsCache,
TestWordDelimiterFilterFactory, TestIBSimilarityFactory,
TestExclusionRuleCollectionAccess, SampleTest,
TestDynamicFieldCollectionResource, AutoCommitTest,
TestDistributedGrouping, EchoParamsTest, TestRandomFaceting,
HdfsChaosMonkeySafeLeaderTest, TestMacros, RollingRestartTest,
TestXIncludeConfig, TestFoldingMultitermQuery, TestReplicationHandler,
ZkCLITest, TestDistribDocBasedVersion, TestRTimerTree,
BlockJoinFacetDistribTest, SpellCheckCollatorTest,
DistributedFacetPivotLargeTest, TestFieldTypeResource,
TestSimpleQParserPlugin, TestRandomRequestDistribution, TestConfigSets,
TestCoreDiscovery]
   [junit4] Completed [245/629 (1!)] on J3 in 1.66s, 12 tests, 2 failures
<<< FAILURES!

BUILD FAILED
/root/solr/solr/build.xml:233: The following error occurred while executing
this line:
/root/solr/solr/common-build.xml:536: The following error occurred while
executing this line:
/root/solr/lucene/common-build.xml:1443: The following error occurred while
executing this line:
/root/solr/lucene/common-build.xml:984: There were test failures: 629
suites (10 ignored), 2679 tests, 2 failures, 85 ignored (70 assumptions)
[seed: 3D48BA4AD5FF1183]
at com.carrotsearch.ant.tasks.junit4.JUnit4.execute(JUnit4.java:1024)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:348)



Any inputs on how to fix this?

-- 
Regards,
*Rishabh Patel*


Re: Problem with Password Decryption in Data Import Handler

2016-11-02 Thread Jamie Jackson
I should have mentioned that I verified connectivity with plain passwords:

From the same machine that Solr's running on:

solr@000650cbdd5e:/opt/solr$ mysql -uroot -pOakton153 -h local.mysite.com \
  mysite -e "select 'foo' as bar;"
+-----+
| bar |
+-----+
| foo |
+-----+

Also, if I add the plain-text password to the config, it connects fine:

<dataSource driver="org.mariadb.jdbc.Driver"
    url="jdbc:mysql://local.mysite.com:3306/mysite"
    user="root" password="Oakton153" />


So that is why I claim to have a problem with encryptKeyFile, specifically,
because I've eliminated general connectivity/authentication problems.

Thanks,
Jamie

On Wed, Nov 2, 2016 at 4:58 PM, Fuad Efendi  wrote:

> In MySQL, this command will explicitly allow connections from the remote
> host ICZ2002912; check the MySQL documentation:
>
> GRANT ALL ON mysite.* TO 'root'@'ICZ2002912' IDENTIFIED BY 'Oakton153';
>
>
>
> On November 2, 2016 at 4:41:48 PM, Fuad Efendi (f...@efendi.ca) wrote:
>
> This is the root of the problem:
> "Access denied for user 'root'@'ICZ2002912' (using password: NO) “
>
>
> First of all, ensure that plain (non-encrypted) password settings work for
> you.
>
> Check that you can connect using MySQL client from ICZ2002912 to your
> MySQL & Co. instance
>
> I suspect you need to allow MySQL & Co. to accept connections
> from ICZ2002912. Plus, check DNS resolution, etc.
>
>
> Thanks,
>
>
> --
> Fuad Efendi
> (416) 993-2060
> http://www.tokenizer.ca
> Recommender Systems
>
>
> On November 2, 2016 at 2:37:08 PM, Jamie Jackson (jamieja...@gmail.com)
> wrote:
>
> I'm at a brick wall. Here's the latest status:
>
> Here are some sample commands that I'm using:
>
> *Create the encryptKeyFile and encrypted password:*
>
>
> encrypter_password='this_is_my_encrypter_password'
> plain_db_pw='Oakton153'
>
> cd /var/docker/solr_stage2/credentials/
> echo -n "${encrypter_password}" > encpwd.txt
> echo -n "${plain_db_pwd}" > plaindbpwd.txt
> openssl enc -aes-128-cbc -a -salt -in plaindbpwd.txt -k
> "${encrypter_password}"
>
> rm plaindbpwd.txt
>
> That generated this as the password, by the way:
>
> U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o=
>
> *Configure DIH configuration:*
>
> 
>
>  driver="org.mariadb.jdbc.Driver"
> url="jdbc:mysql://local.mysite.com:3306/mysite"
> user="root"
> password="U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o="
> encryptKeyFile="/opt/solr/credentials/encpwd.txt"
> />
> ...
>
>
> By the way, /var/docker/solr_stage2/credentials/ is mapped to
> /opt/solr/credentials/ in the docker container, so that's why the paths
> *seem* different (but aren't, really).
>
>
> *Authentication error when data import is run:*
>
> Exception while processing: question document :
> SolrInputDocument(fields:
> []):org.apache.solr.handler.dataimport.DataImportHandlerException:
> Unable to execute query: select 'foo' as bar; Processing
> Document # 1
> at org.apache.solr.handler.dataimport.DataImportHandlerException.
> wrapAndThrow(DataImportHandlerException.java:69)
> at org.apache.solr.handler.dataimport.JdbcDataSource$
> ResultSetIterator.<init>(JdbcDataSource.java:323)
> at org.apache.solr.handler.dataimport.JdbcDataSource.
> getData(JdbcDataSource.java:283)
> at org.apache.solr.handler.dataimport.JdbcDataSource.
> getData(JdbcDataSource.java:52)
> at org.apache.solr.handler.dataimport.SqlEntityProcessor.
> initQuery(SqlEntityProcessor.java:59)
> at org.apache.solr.handler.dataimport.SqlEntityProcessor.
> nextRow(SqlEntityProcessor.java:73)
> at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(
> EntityProcessorWrapper.java:244)
> at org.apache.solr.handler.dataimport.DocBuilder.
> buildDocument(DocBuilder.java:475)
> at org.apache.solr.handler.dataimport.DocBuilder.
> buildDocument(DocBuilder.java:414)
> at org.apache.solr.handler.dataimport.DocBuilder.
> doFullDump(DocBuilder.java:329)
> at org.apache.solr.handler.dataimport.DocBuilder.execute(
> DocBuilder.java:232)
> at org.apache.solr.handler.dataimport.DataImporter.
> doFullImport(DataImporter.java:416)
> at org.apache.solr.handler.dataimport.DataImporter.
> runCmd(DataImporter.java:480)
> at org.apache.solr.handler.dataimport.DataImporter$1.run(
> DataImporter.java:461)
> Caused by: java.sql.SQLInvalidAuthorizationSpecException: Could not
> connect: Access denied for user 'root'@'ICZ2002912' (using password:
> NO)
> at org.mariadb.jdbc.internal.util.ExceptionMapper.get(
> ExceptionMapper.java:123)
> at org.mariadb.jdbc.internal.util.ExceptionMapper.throwException(
> ExceptionMapper.java:71)
> at org.mariadb.jdbc.Driver.connect(Driver.java:109)
> at org.apache.solr.handler.dataimport.JdbcDataSource$1.
> call(JdbcDataSource.java:192)
> at org.apache.solr.handler.dataimport.JdbcDataSource$1.
> call(JdbcDataSource.java:172)
> at org.apache.solr.handler.dataimport.JdbcDataSource.
> getConnection(JdbcDataSource.java:503)
> at org.apache.solr.handler.dataimport.JdbcDataSource$
> ResultSetIterator.<init>(JdbcDataSource.java:313)
> ... 12 more
> Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Could
> not connect: Access denied for user 'root'@'ICZ2002912' (using
> password: NO)
> 

Re: Problem with Password Decryption in Data Import Handler

2016-11-02 Thread Fuad Efendi
In MySQL, this command will explicitly allow connections from the remote host
ICZ2002912; check the MySQL documentation:

GRANT ALL ON mysite.* TO 'root'@'ICZ2002912' IDENTIFIED BY 'Oakton153';



On November 2, 2016 at 4:41:48 PM, Fuad Efendi (f...@efendi.ca) wrote:

This is the root of the problem:
"Access denied for user 'root'@'ICZ2002912' (using password: NO) “


First of all, ensure that plain (non-encrypted) password settings work for you.

Check that you can connect using MySQL client from ICZ2002912 to your MySQL & 
Co. instance

I suspect you need to allow MySQL & Co. to accept connections from ICZ2002912. 
Plus, check DNS resolution, etc. 


Thanks,


--
Fuad Efendi
(416) 993-2060
http://www.tokenizer.ca
Recommender Systems


On November 2, 2016 at 2:37:08 PM, Jamie Jackson (jamieja...@gmail.com) wrote:

I'm at a brick wall. Here's the latest status:

Here are some sample commands that I'm using:

*Create the encryptKeyFile and encrypted password:*


encrypter_password='this_is_my_encrypter_password'
plain_db_pw='Oakton153'

cd /var/docker/solr_stage2/credentials/
echo -n "${encrypter_password}" > encpwd.txt
echo -n "${plain_db_pwd}" > plaindbpwd.txt
openssl enc -aes-128-cbc -a -salt -in plaindbpwd.txt -k
"${encrypter_password}"

rm plaindbpwd.txt

That generated this as the password, by the way:

U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o=

*Configure DIH configuration:*

<dataSource driver="org.mariadb.jdbc.Driver"
    url="jdbc:mysql://local.mysite.com:3306/mysite"
    user="root"
    password="U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o="
    encryptKeyFile="/opt/solr/credentials/encpwd.txt"
/>
...


By the way, /var/docker/solr_stage2/credentials/ is mapped to
/opt/solr/credentials/ in the docker container, so that's why the paths
*seem* different (but aren't, really).


*Authentication error when data import is run:*

Exception while processing: question document :
SolrInputDocument(fields:
[]):org.apache.solr.handler.dataimport.DataImportHandlerException:
Unable to execute query: select 'foo' as bar; Processing
Document # 1
at 
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
at 
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:323)
at 
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:283)
at 
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:52)
at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:244)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
Caused by: java.sql.SQLInvalidAuthorizationSpecException: Could not
connect: Access denied for user 'root'@'ICZ2002912' (using password:
NO)
at org.mariadb.jdbc.internal.util.ExceptionMapper.get(ExceptionMapper.java:123)
at 
org.mariadb.jdbc.internal.util.ExceptionMapper.throwException(ExceptionMapper.java:71)
at org.mariadb.jdbc.Driver.connect(Driver.java:109)
at 
org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:192)
at 
org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:172)
at 
org.apache.solr.handler.dataimport.JdbcDataSource.getConnection(JdbcDataSource.java:503)
at 
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:313)
... 12 more
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Could
not connect: Access denied for user 'root'@'ICZ2002912' (using
password: NO)
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.authentication(AbstractConnectProtocol.java:524)
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.handleConnectionPhases(AbstractConnectProtocol.java:472)
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:374)
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:763)
at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:469)
at org.mariadb.jdbc.Driver.connect(Driver.java:104)
... 16 more



On Thu, Oct 6, 2016 at 2:42 PM, Jamie Jackson  wrote:

> It happens to be ten characters.
>
> On Thu, Oct 6, 2016 at 12:44 PM, Alexandre Rafalovitch  > wrote:
>
>> How long is the encryption key (file content)? Because the code I am
>> looking at seems to expect it to be at most 100 characters.
>>
>> Regards,
>> Alex.
>> 
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/

Re: Problem with Password Decryption in Data Import Handler

2016-11-02 Thread Fuad Efendi
This is the root of the problem:
"Access denied for user 'root'@'ICZ2002912' (using password: NO) “


First of all, ensure that plain (non-encrypted) password settings work for you.

Check that you can connect using MySQL client from ICZ2002912 to your MySQL & 
Co. instance

I suspect you need to allow MySQL & Co. to accept connections from ICZ2002912. 
Plus, check DNS resolution, etc. 


Thanks,


--
Fuad Efendi
(416) 993-2060
http://www.tokenizer.ca
Recommender Systems


On November 2, 2016 at 2:37:08 PM, Jamie Jackson (jamieja...@gmail.com) wrote:

I'm at a brick wall. Here's the latest status:  

Here are some sample commands that I'm using:  

*Create the encryptKeyFile and encrypted password:*  


encrypter_password='this_is_my_encrypter_password'  
plain_db_pw='Oakton153'  

cd /var/docker/solr_stage2/credentials/  
echo -n "${encrypter_password}" > encpwd.txt  
echo -n "${plain_db_pwd}" > plaindbpwd.txt  
openssl enc -aes-128-cbc -a -salt -in plaindbpwd.txt -k  
"${encrypter_password}"  

rm plaindbpwd.txt  

That generated this as the password, by the way:  

U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o=  

*Configure DIH configuration:*

<dataSource driver="org.mariadb.jdbc.Driver"
    url="jdbc:mysql://local.mysite.com:3306/mysite"
    user="root"
    password="U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o="
    encryptKeyFile="/opt/solr/credentials/encpwd.txt"
/>
...  


By the way, /var/docker/solr_stage2/credentials/ is mapped to  
/opt/solr/credentials/ in the docker container, so that's why the paths  
*seem* different (but aren't, really).  


*Authentication error when data import is run:*  

Exception while processing: question document :  
SolrInputDocument(fields:  
[]):org.apache.solr.handler.dataimport.DataImportHandlerException:  
Unable to execute query: select 'foo' as bar; Processing  
Document # 1  
at 
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
  
at 
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:323)
  
at 
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:283)
  
at 
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:52)
  
at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
  
at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
  
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:244)
  
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
  
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
  
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)  
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)  
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
  
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)  
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461) 
 
Caused by: java.sql.SQLInvalidAuthorizationSpecException: Could not  
connect: Access denied for user 'root'@'ICZ2002912' (using password:  
NO)  
at org.mariadb.jdbc.internal.util.ExceptionMapper.get(ExceptionMapper.java:123) 
 
at 
org.mariadb.jdbc.internal.util.ExceptionMapper.throwException(ExceptionMapper.java:71)
  
at org.mariadb.jdbc.Driver.connect(Driver.java:109)  
at 
org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:192)
  
at 
org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:172)
  
at 
org.apache.solr.handler.dataimport.JdbcDataSource.getConnection(JdbcDataSource.java:503)
  
at 
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:313)
  
... 12 more  
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Could  
not connect: Access denied for user 'root'@'ICZ2002912' (using  
password: NO)  
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.authentication(AbstractConnectProtocol.java:524)
  
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.handleConnectionPhases(AbstractConnectProtocol.java:472)
  
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:374)
  
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:763)
  
at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:469)  
at org.mariadb.jdbc.Driver.connect(Driver.java:104)  
... 16 more  



On Thu, Oct 6, 2016 at 2:42 PM, Jamie Jackson  wrote:  

> It happens to be ten characters.  
>  
> On Thu, Oct 6, 2016 at 12:44 PM, Alexandre Rafalovitch  > wrote:  
>  
>> How long is the encryption key (file content)? Because the code I am  
>> looking at seems to expect it to be at most 100 characters.  
>>  
>> Regards,  
>> Alex.  
>>   
>> Newsletter and resources for Solr beginners and intermediates:  
>> http://www.solr-start.com/  
>>  
>>  
>> On 6 October 2016 at 23:26, 

Re: Problem with Password Decryption in Data Import Handler

2016-11-02 Thread Jamie Jackson
I'm at a brick wall. Here's the latest status:

Here are some sample commands that I'm using:

*Create the encryptKeyFile and encrypted password:*


encrypter_password='this_is_my_encrypter_password'
plain_db_pw='Oakton153'

cd /var/docker/solr_stage2/credentials/
echo -n "${encrypter_password}" > encpwd.txt
echo -n "${plain_db_pwd}" > plaindbpwd.txt
openssl enc -aes-128-cbc -a -salt -in plaindbpwd.txt -k
"${encrypter_password}"

rm plaindbpwd.txt

That generated this as the password, by the way:

U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o=

*Configure DIH configuration:*

<dataSource driver="org.mariadb.jdbc.Driver"
    url="jdbc:mysql://local.mysite.com:3306/mysite"
    user="root"
    password="U2FsdGVkX19pBVTeZaSl43gFFAlrx+Th1zSg1GvlX9o="
    encryptKeyFile="/opt/solr/credentials/encpwd.txt"
/>
...


By the way, /var/docker/solr_stage2/credentials/ is mapped to
/opt/solr/credentials/ in the docker container, so that's why the paths
*seem* different (but aren't, really).


*Authentication error when data import is run:*

Exception while processing: question document :
SolrInputDocument(fields:
[]):org.apache.solr.handler.dataimport.DataImportHandlerException:
Unable to execute query: select 'foo' as bar; Processing
Document # 1
at 
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
at 
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:323)
at 
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:283)
at 
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:52)
at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:244)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
Caused by: java.sql.SQLInvalidAuthorizationSpecException: Could not
connect: Access denied for user 'root'@'ICZ2002912' (using password:
NO)
at 
org.mariadb.jdbc.internal.util.ExceptionMapper.get(ExceptionMapper.java:123)
at 
org.mariadb.jdbc.internal.util.ExceptionMapper.throwException(ExceptionMapper.java:71)
at org.mariadb.jdbc.Driver.connect(Driver.java:109)
at 
org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:192)
at 
org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:172)
at 
org.apache.solr.handler.dataimport.JdbcDataSource.getConnection(JdbcDataSource.java:503)
at 
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:313)
... 12 more
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Could
not connect: Access denied for user 'root'@'ICZ2002912' (using
password: NO)
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.authentication(AbstractConnectProtocol.java:524)
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.handleConnectionPhases(AbstractConnectProtocol.java:472)
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:374)
at 
org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:763)
at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:469)
at org.mariadb.jdbc.Driver.connect(Driver.java:104)
... 16 more



On Thu, Oct 6, 2016 at 2:42 PM, Jamie Jackson  wrote:

> It happens to be ten characters.
>
> On Thu, Oct 6, 2016 at 12:44 PM, Alexandre Rafalovitch  > wrote:
>
>> How long is the encryption key (file content)? Because the code I am
>> looking at seems to expect it to be at most 100 characters.
>>
>> Regards,
>>Alex.
>> 
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>>
>> On 6 October 2016 at 23:26, Kevin Risden 
>> wrote:
>> > I haven't tried this but is it possible there is a new line at the end
>> in
>> > the file?
>> >
>> > If you did something like echo "" > file.txt then there would be a new
>> > line. Use echo -n "" > file.txt
>> >
>> > Also you should be able to check how many characters are in the file.
>> >
>> > Kevin Risden
>> >
>> > On Wed, Oct 5, 2016 at 5:00 PM, Jamie Jackson 
>> wrote:
>> >
>> >> Hi Folks,
>> >>
>> 

Re: Timeout occured while waiting response from server at: http://***/solr/commodityReview

2016-11-02 Thread Fuad Efendi
My 2 cents (rounded):

Quote: "the size of our index data is more than 30GB every year now”

- is it the size of *data* or the size of *index*? This is super important!

You can have petabytes of data, growing terabytes a year, and your index files 
will grow only few gigabytes a year at most.

Note also that Lucene index files are immutable: it means that, for example, if 
your index files total size is 25Gb in a filesystem, then having at least 
25Gb+2Gb of free RAM available (for index files + for OS) will be beneficial 
(as already mentioned in this thread).

However, caching index files in RAM won't, by itself, take search performance
from minutes of response time down to milliseconds. If you really have
timeouts (and I believe you use at least a 60-second timeout setting for
SolrJ), then possible reasons could be:

1. A "shared VM", such as Amazon shared nodes; sometimes they just stall for a
few minutes
2. Garbage collection in Java
3. A sophisticated Solr query, such as faceting and aggregations, with an
inadequately configured field cache and other caches


Having 100GB of index files in a filesystem should not cause more than a few
milliseconds of response time for trivial queries such as "text:Solr"!
(Exception: faceting)

You need to isolate (troubleshoot) your timeouts. You mentioned it only
happens on new queries against the new searcher after replication from master
to slave, which points to case #3: improperly configured cache parameters.
You need warm-up queries: a new Solr searcher should become available only
after its internal caches are warmed up (prepopulated with data).
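
Warming queries are configured as newSearcher/firstSearcher event listeners in
solrconfig.xml. To see whether caches really are cold after replication, you
can inspect cache statistics on a slave; the host and core name below are
illustrative:

  curl "http://localhost:8983/solr/commodityReview/admin/mbeans?cat=CACHE&stats=true&wt=json"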

Memory estimate example: suppose you configured Solr in such a way that it
uses the field cache for an SKU field. Suppose the SKU field averages 64 bytes
(UTF-8 can take 2 bytes per character), and you have 100 million documents.
Then you will need 64 x 100,000,000 = 6,400,000,000 bytes, roughly 6.4GB, for
just this one field cache instance. This is the basic formula. If you have a
few such fields, you will need a ton of memory, and a few minutes to warm up
the field cache. Calculate it properly: 8GB or 24GB? Consider sharding /
SolrCloud if you need huge memory just for the field cache. And you will be
forced to consider it if you have more than 2 billion documents (am I right?
Lucene internal limitation, Integer.MAX_VALUE)



Thanks,


--
Fuad Efendi
(416) 993-2060
http://www.tokenizer.ca
Search Relevancy and Recommender Systems


On November 2, 2016 at 1:11:10 PM, Erick Erickson (erickerick...@gmail.com) 
wrote:

You need to move to SolrCloud when it's  
time to shard ;).  

More seriously, at some point simply adding more  
memory will not be adequate. Either your JVM  
heap will grow to a point where you start encountering
GC pauses, or the time to serve requests will
increase unacceptably. "When?" you ask? Well,
unfortunately there are no guidelines that can be
guaranteed; here's a long blog on the subject:

https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
  

The short form is you need to stress-test your  
index and query patterns.  

Now, I've seen 20M docs strain a 32G Java heap. I've  
seen 300M docs give very nice response times with  
12G of memory. It Depends (tm).  

Whether to put Solr on bare metal or not: There's  
inevitably some penalty for a VM. That said there are lots  
of places that use VMs successfully. Again, stress  
testing is the key.  

And finally, using docValues for any field that sorts,  
facets or groups will reduce the JVM requirements  
significantly, albeit by using OS memory space, see  
Uwe's excellent blog:  

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html  

Best,  
Erick  

On Tue, Nov 1, 2016 at 10:23 PM, Kent Mu  wrote:  
> Thanks, I got it, Erick!  
>  
> Our index data grows by more than 30GB every year now and is still
> growing, and our Solr currently runs on a virtual machine. So I wonder
> whether we need to deploy Solr on a physical machine, or whether I can
> just upgrade the physical memory of our virtual machines?
>  
> Best,  
> Kent  
>  
> 2016-11-02 11:33 GMT+08:00 Erick Erickson :  
>  
>> Kent: OK, I see now. Then a minor pedantic point...  
>>  
>> It'll avoid confusion if you use master and slaves  
>> rather than master and replicas when talking about  
>> non-cloud setups.  
>>  
>> The equivalent in SolrCloud is leader and replicas.  
>>  
>> No big deal either way, just FYI.  
>>  
>> Best,  
>> Erick  
>>  
>> On Tue, Nov 1, 2016 at 8:09 PM, Kent Mu  wrote:  
>> > Thanks a lot for your reply, Shawn!  
>> >  
>> > There are no other applications on the server. I agree with you that we
>> > need to upgrade the physical memory and allocate a reasonable JVM size, so
>> > that the operating system has spare memory available to cache the index.
>> >
>> > Actually, we add nearly 100 million documents every year now, and it is
>> > 

Question about shards, compositeid, and routing

2016-11-02 Thread Michael Joyner (NewsRx)
Ref: 
https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud


If an update specifies only the non-routed id, will SolrCloud select the 
correct shard for updating?


If an update specifies a different route, will SolrCloud delete the 
previous document with the same id but with the different routing? (Will 
it effectively change which shard the document is stored on?)


Does the document id have to be unique ignoring the routing prefix? (Is 
the routing prefix considered as part of the id for uniqueness?)



-Mike
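
These can be probed directly. With the default compositeId router, the routing
prefix is part of the uniqueKey, so "tenantA!doc1" and "tenantB!doc1" are two
distinct documents; sending an update with a different prefix creates a new
document on (possibly) another shard rather than moving the old one, and an
update that omits the prefix hashes the bare id and may land on a different
shard. A sketch (host, collection, and ids are illustrative):

  curl -X POST "http://localhost:8983/solr/mycoll/update?commit=true" \
    -H 'Content-Type: application/json' \
    -d '[{"id":"tenantA!doc1"},{"id":"tenantB!doc1"}]'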



Re: Timeout occured while waiting response from server at: http://***/solr/commodityReview

2016-11-02 Thread Erick Erickson
You need to move to SolrCloud when it's
time to shard ;).

More seriously, at some point simply adding more
memory will not be adequate. Either your JVM
heap will grow to a point where you start encountering
GC pauses, or the time to serve requests will
increase unacceptably. "When?" you ask? Well,
unfortunately there are no guidelines that can be
guaranteed; here's a long blog on the subject:

https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

The short form is you need to stress-test your
index and query patterns.

Now, I've seen 20M docs strain a 32G Java heap. I've
seen 300M docs give very nice response times with
12G of memory. It Depends (tm).

Whether to put Solr on bare metal or not: There's
inevitably some penalty for a VM. That said there are lots
of places that use VMs successfully. Again, stress
testing is the key.

And finally, using docValues for any field that sorts,
facets or groups will reduce the JVM requirements
significantly, albeit by using OS memory space, see
Uwe's excellent blog:

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
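
With a managed schema, docValues can be enabled per field through the Schema
API; a sketch (the collection and field names are placeholders, and existing
documents must be reindexed afterwards):

  curl -X POST -H 'Content-type:application/json' \
    "http://localhost:8983/solr/mycollection/schema" \
    --data-binary '{"replace-field":{"name":"category","type":"string","stored":true,"docValues":true}}'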

Best,
Erick

On Tue, Nov 1, 2016 at 10:23 PM, Kent Mu  wrote:
> Thanks, I got it, Erick!
>
> Our index data grows by more than 30GB every year now and is still
> growing, and our Solr currently runs on a virtual machine. So I wonder
> whether we need to deploy Solr on a physical machine, or whether I can
> just upgrade the physical memory of our virtual machines?
>
> Best,
> Kent
>
> 2016-11-02 11:33 GMT+08:00 Erick Erickson :
>
>> Kent: OK, I see now. Then a minor pedantic point...
>>
>> It'll avoid confusion if you use master and slaves
>> rather than master and replicas when talking about
>> non-cloud setups.
>>
>> The equivalent in SolrCloud is leader and replicas.
>>
>> No big deal either way, just FYI.
>>
>> Best,
>> Erick
>>
>> On Tue, Nov 1, 2016 at 8:09 PM, Kent Mu  wrote:
>> > Thanks a lot for your reply, Shawn!
>> >
>> > There are no other applications on the server. I agree with you that we
>> > need to upgrade the physical memory and allocate a reasonable JVM size, so
>> > that the operating system has spare memory available to cache the index.
>> >
>> > Actually, we add nearly 100 million documents every year now, and it is
>> > still growing, and our Solr currently runs on a virtual machine, so I
>> > wonder if we need to deploy Solr on a physical machine.
>> >
>> > Best Regards!
>> > Kent
>> >
>> > 2016-11-01 21:18 GMT+08:00 Shawn Heisey :
>> >
>> >> On 11/1/2016 1:07 AM, Kent Mu wrote:
>> >> > Hi friends! We come across an issue when we use the solrj(4.9.1) to
>> >> > connect to solr server, our deployment is one master with 10 replicas.
>> >> > we index data to the master, and search data from the replicas via
>> >> > load balancing. the error stack is as below: *Timeout occured while
>> >> > waiting response from server at:
>> >> > http://review.solrsearch3.cnsuning.com/solr/commodityReview
>> >> > *
>> >> > org.apache.solr.client.solrj.SolrServerException: Timeout occured
>> >> > while waiting response from server at:
>> >>
>> >> This shows that you are connecting to port 80.  It is relatively rare to
>> >> run Solr on port 80, though it is possible.  Do you have an intermediate
>> >> layer, like a proxy or a load balancer?  If so, you'll need to ensure
>> >> that there's not a problem there.  If it works normally when replication
>> >> isn't happening, that's probably not a worry.
>> >>
>> >> > It does not happen often. After analysis, we find it occurs only when
>> >> > the replicas synchronize data from the master Solr server. It seems
>> >> > that the replicas block search requests while synchronizing data from
>> >> > the master; is that true?
>> >>
>> >> Solr should be able to continue serving requests while replication
>> >> happens.  I have never heard of this happening before, and I never ran
>> >> into it when I was using replication a long time ago on version 1.4.x.
>> >> I think it is more likely that you've got a memory issue than a bug.  If
>> >> it IS a bug, it will *not* be fixed in a 4.x version, you would need to
>> >> upgrade to 6.x and see whether it's still a problem.  Version 6.2.1 is
>> >> the latest at the moment, and release plans are underway for 6.3 right
>> now.
>> >>
>> >> > I wonder if it is because that our solr server hardware configuration
>> >> > is too low? the physical memory is 8G with 4 cores. and the JVM we set
>> >> > is Xms512m, Xmx7168m.
>> >>
>> >> The following assumes that there is no other software on the server,
>> >> like a database, or an application server, web server, etc.  If there
>> >> is, any issues are likely to be a result of extreme memory starvation,
>> >> and possibly swapping.  Additional physical memory is definitely needed
>> >> if there is other 

Re: edismax

2016-11-02 Thread Vincenzo D'Amore
Hi Rafael,

I suggest checking all the fields present in your qf, looking for one (or
more) where the stopwords filter is missing; very likely there is such a
field.

The issue you're experiencing is caused by an attempt to match a stopword
on a "non-stopword-filtered" field, which makes mm=100% fail.

I also suggest taking a look at the mm.autoRelax param for the edismax
parser.
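
A sketch of such a request, with host, collection, and field names as
placeholders:

  curl "http://localhost:8983/solr/mycollection/select" \
    --data-urlencode "defType=edismax" \
    --data-urlencode "q=the report" \
    --data-urlencode "qf=title description" \
    --data-urlencode "mm=100%" \
    --data-urlencode "mm.autoRelax=true" \
    --data-urlencode "debugQuery=true"

debugQuery shows the parsed query, which makes it easy to spot the field
where the stopword survives analysis.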

Best regards,
Vincenzo

On Wed, Nov 2, 2016 at 4:07 PM, Rafael Merino García <
rmer...@paradigmadigital.com> wrote:

> Hi guys,
>
> I came across the following issue. I configured an edismax query parser
> where *mm=100%*, and when the user types in a stopword, no result is
> returned (stopwords are filtered before indexing, but, somehow, either they
> are not being filtered before searching or they are taken into account when
> computing *mm*). Reading the documentation about the edismax parser (latest
> version) I found the parameter *stopwords*, but changing it has no
> effect...
>
> Thanks in advance
>
> Regards
>



-- 
Vincenzo D'Amore
email: v.dam...@gmail.com
skype: free.dev
mobile: +39 349 8513251


edismax

2016-11-02 Thread Rafael Merino García
Hi guys,

I came across the following issue. I configured an edismax query parser
where *mm=100%*, and when the user types in a stopword, no result is
returned (stopwords are filtered before indexing, but, somehow, either they
are not being filtered before searching or they are taken into account when
computing *mm*). Reading the documentation about the edismax parser (latest
version) I found the parameter *stopwords*, but changing it has no
effect...

Thanks in advance

Regards


Re: Heatmap in JSON facet API

2016-11-02 Thread Никита Веневитин
Thank you
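
Until the JSON facet API supports it, heatmaps are available through the
classic facet syntax; a sketch (the collection and spatial field name are
placeholders, and the field must be a spatial RPT type):

  curl "http://localhost:8983/solr/mycollection/select" \
    --data-urlencode "q=*:*" \
    --data-urlencode "facet=true" \
    --data-urlencode "facet.heatmap=location_rpt" \
    --data-urlencode 'facet.heatmap.geom=["-180 -90" TO "180 90"]'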

2016-11-02 7:35 GMT+03:00 David Smiley :

> I plan on adding this in the near future... hopefully for Solr 6.4.
>
> On Mon, Oct 31, 2016 at 7:06 AM Никита Веневитин <
> venevitinnik...@gmail.com>
> wrote:
>
> > I've built a query as described in "Heatmap Faceting"
> > (https://cwiki.apache.org/confluence/x/ZYDxAQ),
> > but I would like to get the same results using the JSON facet API
> >
> > 2016-10-30 15:24 GMT+03:00 GW :
> >
> > > If we are talking about the same kind of heat maps you might want to
> look
> > > at the TomTom map API for a quick and dirty yet solid solution. Just
> > supply
> > > a whack of coordinates and let TomTom do the work. The Heat maps will
> > zoom
> > > in and de-cluster.
> > >
> > > Example below.
> > >
> > > http://www.frogclassifieds.com/tomtom/markers-clustering.html
> > >
> > >
> > > On 28 October 2016 at 09:05, Никита Веневитин <
> venevitinnik...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi!
> > > > Is it possible to use JSON facet API to get heatmaps?
> > > >
> > >
> >
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>