Re: Scaling issue with Solr

2017-12-27 Thread Damien Kamerman
You seem to have the soft and hard commit intervals the wrong way around; a
hard commit is more expensive than a soft commit.

On 28 December 2017 at 09:10, Walter Underwood 
wrote:

> Why are you using Solr for log search? Elasticsearch is widely used for
> log search and has the best infrastructure for that.
>
> For the past few years, it looks like a natural market segmentation is
> happening, with Solr used for product search and ES for log search. By now,
> I would not expect Solr to keep up with ES in log search features.
> Likewise, I would not expect ES to keep up with Solr for product and text
> search features.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Dec 27, 2017, at 1:33 PM, Erick Erickson 
> wrote:
> >
> > You are probably hitting more and more background merging which will
> > slow things down. Your system looks to be severely undersized for this
> > scale.
> >
> > One thing you can try (and I emphasize I haven't prototyped this) is
> > to increase your ramBufferSizeMB setting in solrconfig.xml significantly.
> > By default, Solr won't merge segments to greater than 5G, so
> > theoretically you could just set your ramBufferSizeMB to that figure
> > and avoid merging altogether. Or you could try configuring the
> > NoMergePolicy in solrconfig.xml (but beware that you're going to
> > create a lot of segments unless you set the ramBufferSizeMB higher).
> >
> > I frankly have no data on how this will affect your indexing throughput.
> > You can see, though, that with numbers like these a 4G heap is much too
> > small.
> >
> > Best,
> > Erick
> >
> > On Wed, Dec 27, 2017 at 2:18 AM, Prasad Tendulkar
> >  wrote:
> >> Hello All,
> >>
> >> We have been building a Solr based solution to hold a large amount of
> data (approx 4 TB/day or > 24 Billion documents per day). We are developing
> a prototype on a small scale just to evaluate Solr performance gradually.
> Here is our setup configuration.
> >>
> >> Solr cloud:
> >> node1: 16 GB RAM, 8 Core CPU, 1TB disk
> >> node2: 16 GB RAM, 8 Core CPU, 1TB disk
> >>
> >> Zookeeper is also installed on above 2 machines in cluster mode.
> >> Solr commit intervals: Soft commit 3 minutes, Hard commit 15 seconds
> >> Schema: Basic configuration. 5 fields indexed (one of which is
> text_general), 6 fields stored.
> >> Collection: 12 shards (6 per node)
> >> Heap memory: 4 GB per node
> >> Disk cache: 12 GB per node
> >> Document is a syslog message.
> >>
> >> Documents are being ingested into Solr from different nodes. 12 SolrJ
> clients ingest data into the Solr cloud.
> >>
> >> We are experiencing issues when we keep the setup running for a long time
> and after processing around 100 GB of index size (i.e. around 600 million
> documents). Note that we are only indexing the data and not querying it, so
> there should not be any query overhead. From the VM analysis we figured out
> that over time the disk operations start declining and so do the CPU,
> RAM and network usage of the Solr nodes. We concluded that Solr is unable
> to handle one big collection due to index read/write overhead and most of
> the time it ends up doing only the commit (evident in the Solr logs), and
> because of that indexing is getting hampered (?)
> >>
> >> So we thought of creating small sized collections instead of one big
> collection anticipating the commit performance might improve. But
> eventually the performance degrades even with that and we observe more or
> less similar charts for CPU, memory, disk and network.
> >>
> >> To put forth some stats here are the number of documents processed
> every hour
> >>
> >> 1st hour: 250 million
> >> 2nd hour: 250 million
> >> 3rd hour: 240 million
> >> 4th hour: 200 million
> >> .
> >> .
> >> 11th hour: 80 million
> >>
> >> Could you please help us identify the root cause of the degradation in
> performance? Are we doing something wrong with the Solr configuration
> or the collections/sharding, etc.? Due to this performance degradation we are
> currently stuck with Solr.
> >>
> >> Thank you very much in advance.
> >>
> >> Prasad Tendulkar
> >>
> >>
>
>


Re: OutOfMemoryError in 6.5.1

2017-11-23 Thread Damien Kamerman
I found the suggesters very memory hungry. I had one particularly large
index where the suggester should have been filtering a small number of
docs, but was mmap'ing the entire index. I only ever saw this behavior with
the suggesters.

On 22 November 2017 at 03:17, Walter Underwood 
wrote:

> All our customizations are in solr.in.sh. We’re using the one we
> configured for 6.3.0. I’ll check for any differences between that and the
> 6.5.1 script.
>
> I don’t see any arguments at all in the dashboard. I do see them in a ps
> listing, right at the end.
>
> java -server -Xms8g -Xmx8g -XX:+UseG1GC -XX:+ParallelRefProcEnabled
> -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=200 -XX:+UseLargePages
> -XX:+AggressiveOpts -XX:+HeapDumpOnOutOfMemoryError -verbose:gc
> -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution 
> -XX:+PrintGCApplicationStoppedTime
> -Xloggc:/solr/logs/solr_gc.log -XX:+UseGCLogFileRotation
> -XX:NumberOfGCLogFiles=9 -XX:GCLogFileSize=20M
> -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.port=18983 
> -Dcom.sun.management.jmxremote.rmi.port=18983
> -Djava.rmi.server.hostname=new-solr-c01.test3.cloud.cheggnet.com
> -DzkClientTimeout=15000 -DzkHost=zookeeper1.test3.cloud.cheggnet.com:2181,
> zookeeper2.test3.cloud.cheggnet.com:2181,zookeeper3.test3.cloud.
> cheggnet.com:2181/solr-cloud -Dsolr.log.level=WARN
> -Dsolr.log.dir=/solr/logs -Djetty.port=8983 -DSTOP.PORT=7983
> -DSTOP.KEY=solrrocks -Dhost=new-solr-c01.test3.cloud.cheggnet.com
> -Duser.timezone=UTC -Djetty.home=/apps/solr6/server
> -Dsolr.solr.home=/apps/solr6/server/solr -Dsolr.install.dir=/apps/solr6
> -Dgraphite.prefix=solr-cloud.new-solr-c01 -Dgraphite.host=influx.test.
> cheggnet.com -javaagent:/apps/solr6/newrelic/newrelic.jar
> -Dnewrelic.environment=test3 -Dsolr.log.muteconsole -Xss256k
> -Dsolr.log.muteconsole -XX:OnOutOfMemoryError=/apps/solr6/bin/oom_solr.sh
> 8983 /solr/logs -jar start.jar --module=http
>
> I’m still confused why we are hitting OOM in 6.5.1 but weren’t in 6.3.0.
> Our load benchmarks use prod logs. We added suggesters, but those use
> analyzing infix, so they are search indexes, not in-memory.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Nov 21, 2017, at 5:46 AM, Shawn Heisey  wrote:
> >
> > On 11/20/2017 6:17 PM, Walter Underwood wrote:
> >> When I ran load benchmarks with 6.3.0, an overloaded cluster would get
> super slow but keep functioning. With 6.5.1, we hit 100% CPU, then start
> getting OOMs. That is really bad, because it means we need to reboot every
> node in the cluster.
> >> Also, the JVM OOM hook isn’t running the process killer (JVM
> 1.8.0_121-b13). Using the G1 collector with the Shawn Heisey settings in an
> 8G heap.
> > 
> >> This is not good behavior in prod. The process goes to the bad place,
> then we need to wait until someone is paged and kills it manually. Luckily,
> it usually drops out of the live nodes for each collection and doesn’t take
> user traffic.
> >
> > There was a bug, fixed long before 6.3.0, where the OOM killer script
> wasn't working because the arguments enabling it were in the wrong place.
> It was fixed in 5.5.1 and 6.0.
> >
> > https://issues.apache.org/jira/browse/SOLR-8145
> >
> > If the scripts that you are using to get Solr started originated with a
> much older version of Solr than you are currently running, maybe you've got
> the arguments in the wrong order.
> >
> > Do you see the commandline arguments for the OOM killer (only available
> on *NIX systems, not Windows) on the admin UI dashboard?  If they are
> properly placed, you will see them on the dashboard, but if they aren't
> properly placed, then you won't see them.  This is what the argument looks
> like for one of my Solr installs:
> >
> > -XX:OnOutOfMemoryError=/opt/solr/bin/oom_solr.sh 8983 /var/solr/logs
> >
> > Something which you probably already know:  If you're hitting OOM, you
> need a larger heap, or you need to adjust the config so it uses less
> memory.  There are no other ways to "fix" OOM problems.
> >
> > Thanks,
> > Shawn
>
>


Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-19 Thread Damien Kamerman
A suggester rebuild will mmap the entire index, so you'll need free memory
in proportion to your index size.

On 19 September 2017 at 13:47, shamik  wrote:

> I agree, I should have made it clear in my initial post. The reason I thought
> it was trivial is that the newly introduced collection has only a few
> hundred documents and is not being used in search yet, nor is it being
> indexed at a regular interval. The cache parameters are kept to a minimum
> as well. But there might be overheads to simply creating a collection which
> I'm not aware of.
>
> I did bring down the heap size to 8gb, changed to G1 and reduced the cache
> params. The memory so far has been holding up but will wait for a while
> before passing on a judgment.
>
>  autowarmCount="0"/>
>  autowarmCount="0"/>
>  autowarmCount="0"/>
>  initialSize="0" autowarmCount="10" regenerator="solr.NoOpRegenerator" />
>  showItems="0" />
>
> The change seems to have increased the number of slow queries (1000 ms),
> but I'm willing to prioritize fixing the OOM over performance at this point.
> One thing I realized is that I provided the wrong index size here. It's 49gb
> instead of 25, which I mistakenly picked from one shard. I hope the heap size
> will continue to hold up for that index size.
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Allow Join over two sharded collection

2017-06-29 Thread Damien Kamerman
Joins will work with shards as long as the docs you're joining from/to are
in the same shard. Why not use compositeId routing (either an id of the form
shardKey!docId, or router.field)? Is there no 'uniqueKey' that will distribute
randomly? You may need to put the same ACL docs in all shards, depending on
your use case.
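For illustration, a minimal SolrJ sketch of the compositeId idea (field names and
the 'acme' route key are just assumptions): the real doc and its ACL doc share the
same prefix before the '!', so they hash to the same shard and a shard-local
{!join} between them keeps working.

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class CompositeIdRoutingSketch {
    public static void main(String[] args) throws Exception {
        try (SolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:2181").build()) {
            // Real document: the "acme!" prefix decides the shard.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "acme!doc1");
            doc.addField("type_s", "document");

            // Its ACL document: same "acme!" prefix, so it lands on the same shard
            // and a shard-local {!join from=acl_for_s to=id} can reach it.
            SolrInputDocument acl = new SolrInputDocument();
            acl.addField("id", "acme!acl1");
            acl.addField("type_s", "acl");
            acl.addField("acl_for_s", "acme!doc1");
            acl.addField("acl_users_ss", "user1");

            client.add("collection1", doc);
            client.add("collection1", acl);
            client.commit("collection1");
        }
    }
}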

On 30 June 2017 at 12:57, mganeshs  wrote:

> Hi Erick,
>
> Initially I also thought of using streaming for joins, but it looks like joins
> with streaming are not meant for high-QPS sorts of queries, and that's my use
> case. Currently things are working fine with the normal join for us as we have
> only one shard. But in the coming days the number of documents to be indexed is
> going to increase drastically, so we need to split shards. Once I split
> shards I can't use joins.
>
> We thought of going with implicit routing for sharding. But if we go with
> implicit routing, indexing will not be evenly distributed and so one shard
> could get more load, which we don't want.
> So we badly need the default join.
> As I have posted in other questions in this forum (and you have replied to),
> our joins are between real documents and their ACL
> documents. An ACL document has a multi-valued field whose values are users or
> groups. Why do we want to keep the ACL separate instead of keeping it in the
> real document itself? Because our ACL can grow to 100,000 (1 lakh) users
> or even more, and for every change in the ACL or its permissions we don't want
> to re-index the real document as well.
>
> Do you think there is any better alternative? Or is the way we have kept the
> ACLs wrong?
>
> Regards,
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/Allow-Join-over-two-sharded-collection-tp4343443p4343582.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: async backup

2017-06-27 Thread Damien Kamerman
Yes. REQUESTSTATUS is returning state=completed prematurely.

On Tuesday, 27 June 2017, Amrit Sarkar <sarkaramr...@gmail.com> wrote:

> Damien,
>
> then I poll with REQUESTSTATUS
>
>
> REQUESTSTATUS is an API which provides you the status of any API call
> (including other heavy-duty APIs like SPLITSHARD or CREATECOLLECTION)
> associated with the async_id at that moment. Does that give
> you "state"="completed"?
>
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>
> On Tue, Jun 27, 2017 at 5:25 AM, Damien Kamerman <dami...@gmail.com
> <javascript:;>> wrote:
>
> > A regular backup creates the files in this order:
> > drwxr-xr-x   2 root root  63 Jun 27 09:46 snapshot.shard7
> > drwxr-xr-x   2 root root 159 Jun 27 09:46 snapshot.shard8
> > drwxr-xr-x   2 root root 135 Jun 27 09:46 snapshot.shard1
> > drwxr-xr-x   2 root root 178 Jun 27 09:46 snapshot.shard3
> > drwxr-xr-x   2 root root 210 Jun 27 09:46 snapshot.shard11
> > drwxr-xr-x   2 root root 218 Jun 27 09:46 snapshot.shard9
> > drwxr-xr-x   2 root root 180 Jun 27 09:46 snapshot.shard2
> > drwxr-xr-x   2 root root 164 Jun 27 09:47 snapshot.shard5
> > drwxr-xr-x   2 root root 252 Jun 27 09:47 snapshot.shard6
> > drwxr-xr-x   2 root root 103 Jun 27 09:47 snapshot.shard12
> > drwxr-xr-x   2 root root 135 Jun 27 09:47 snapshot.shard4
> > drwxr-xr-x   2 root root 119 Jun 27 09:47 snapshot.shard10
> > drwxr-xr-x   3 root root   4 Jun 27 09:47 zk_backup
> > -rw-r--r--   1 root root 185 Jun 27 09:47 backup.properties
> >
> > While an async backup creates files in this order:
> > drwxr-xr-x   2 root root  15 Jun 27 09:49 snapshot.shard3
> > drwxr-xr-x   2 root root  15 Jun 27 09:49 snapshot.shard9
> > drwxr-xr-x   2 root root  62 Jun 27 09:49 snapshot.shard6
> > drwxr-xr-x   2 root root  37 Jun 27 09:49 snapshot.shard2
> > drwxr-xr-x   2 root root  67 Jun 27 09:49 snapshot.shard7
> > drwxr-xr-x   2 root root  75 Jun 27 09:49 snapshot.shard5
> > drwxr-xr-x   2 root root  70 Jun 27 09:49 snapshot.shard8
> > drwxr-xr-x   2 root root  15 Jun 27 09:49 snapshot.shard4
> > drwxr-xr-x   2 root root  15 Jun 27 09:50 snapshot.shard11
> > drwxr-xr-x   2 root root 127 Jun 27 09:50 snapshot.shard1
> > drwxr-xr-x   2 root root 116 Jun 27 09:50 snapshot.shard12
> > drwxr-xr-x   3 root root   4 Jun 27 09:50 zk_backup
> > -rw-r--r--   1 root root 185 Jun 27 09:50 backup.properties
> > drwxr-xr-x   2 root root  25 Jun 27 09:51 snapshot.shard10
> >
> >
> > shard10 is much larger than the other shards.
> >
> > From the logs:
> > INFO  - 2017-06-27 09:50:33.832; [   ] org.apache.solr.cloud.BackupCmd;
> > Completed backing up ZK data for backupName=collection1
> > INFO  - 2017-06-27 09:50:33.800; [   ]
> > org.apache.solr.handler.admin.CoreAdminOperation; Checking request
> status
> > for : backup1103459705035055
> > INFO  - 2017-06-27 09:50:33.800; [   ]
> > org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null
> > path=/admin/cores
> > params={qt=/admin/cores=backup1103459705035055=
> > REQUESTSTATUS=javabin=2}
> > status=0 QTime=0
> > INFO  - 2017-06-27 09:51:33.405; [   ] org.apache.solr.handler.
> > SnapShooter;
> > Done creating backup snapshot: shard10 at file:///online/backup/
> > collection1
> >
> > Has anyone seen this bug, or knows a workaround?
> >
> >
> > On 27 June 2017 at 09:47, Damien Kamerman <dami...@gmail.com
> <javascript:;>> wrote:
> >
> > > Yes, the async command returns, and then I poll with REQUESTSTATUS.
> > >
> > > On 27 June 2017 at 01:24, Varun Thacker <va...@vthacker.in
> <javascript:;>> wrote:
> > >
> > >> Hi Damien,
> > >>
> > >> A backup command with async is supposed to return early. It is start
> the
> > >> backup process and return.
> > >>
> > >> Are you using the REQUESTSTATUS (
> > >> http://lucene.apache.org/solr/guide/6_6/collections-api.html
> > >> #collections-api
> > >> ) API to validate if the backup is complete?
> > >>
> > >>

Re: async backup

2017-06-26 Thread Damien Kamerman
A regular backup creates the files in this order:
drwxr-xr-x   2 root root  63 Jun 27 09:46 snapshot.shard7
drwxr-xr-x   2 root root 159 Jun 27 09:46 snapshot.shard8
drwxr-xr-x   2 root root 135 Jun 27 09:46 snapshot.shard1
drwxr-xr-x   2 root root 178 Jun 27 09:46 snapshot.shard3
drwxr-xr-x   2 root root 210 Jun 27 09:46 snapshot.shard11
drwxr-xr-x   2 root root 218 Jun 27 09:46 snapshot.shard9
drwxr-xr-x   2 root root 180 Jun 27 09:46 snapshot.shard2
drwxr-xr-x   2 root root 164 Jun 27 09:47 snapshot.shard5
drwxr-xr-x   2 root root 252 Jun 27 09:47 snapshot.shard6
drwxr-xr-x   2 root root 103 Jun 27 09:47 snapshot.shard12
drwxr-xr-x   2 root root 135 Jun 27 09:47 snapshot.shard4
drwxr-xr-x   2 root root 119 Jun 27 09:47 snapshot.shard10
drwxr-xr-x   3 root root   4 Jun 27 09:47 zk_backup
-rw-r--r--   1 root root 185 Jun 27 09:47 backup.properties

While an async backup creates files in this order:
drwxr-xr-x   2 root root  15 Jun 27 09:49 snapshot.shard3
drwxr-xr-x   2 root root  15 Jun 27 09:49 snapshot.shard9
drwxr-xr-x   2 root root  62 Jun 27 09:49 snapshot.shard6
drwxr-xr-x   2 root root  37 Jun 27 09:49 snapshot.shard2
drwxr-xr-x   2 root root  67 Jun 27 09:49 snapshot.shard7
drwxr-xr-x   2 root root  75 Jun 27 09:49 snapshot.shard5
drwxr-xr-x   2 root root  70 Jun 27 09:49 snapshot.shard8
drwxr-xr-x   2 root root  15 Jun 27 09:49 snapshot.shard4
drwxr-xr-x   2 root root  15 Jun 27 09:50 snapshot.shard11
drwxr-xr-x   2 root root 127 Jun 27 09:50 snapshot.shard1
drwxr-xr-x   2 root root 116 Jun 27 09:50 snapshot.shard12
drwxr-xr-x   3 root root   4 Jun 27 09:50 zk_backup
-rw-r--r--   1 root root 185 Jun 27 09:50 backup.properties
drwxr-xr-x   2 root root  25 Jun 27 09:51 snapshot.shard10


shard10 is much larger than the other shards.

From the logs:
INFO  - 2017-06-27 09:50:33.832; [   ] org.apache.solr.cloud.BackupCmd;
Completed backing up ZK data for backupName=collection1
INFO  - 2017-06-27 09:50:33.800; [   ]
org.apache.solr.handler.admin.CoreAdminOperation; Checking request status
for : backup1103459705035055
INFO  - 2017-06-27 09:50:33.800; [   ]
org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/cores
params={qt=/admin/cores=backup1103459705035055=REQUESTSTATUS=javabin=2}
status=0 QTime=0
INFO  - 2017-06-27 09:51:33.405; [   ] org.apache.solr.handler.SnapShooter;
Done creating backup snapshot: shard10 at file:///online/backup/collection1

Has anyone seen this bug, or knows a workaround?


On 27 June 2017 at 09:47, Damien Kamerman <dami...@gmail.com> wrote:

> Yes, the async command returns, and then I poll with REQUESTSTATUS.
>
> On 27 June 2017 at 01:24, Varun Thacker <va...@vthacker.in> wrote:
>
>> Hi Damien,
>>
>> A backup command with async is supposed to return early. It is start the
>> backup process and return.
>>
>> Are you using the REQUESTSTATUS (
>> http://lucene.apache.org/solr/guide/6_6/collections-api.html
>> #collections-api
>> ) API to validate if the backup is complete?
>>
>> On Sun, Jun 25, 2017 at 10:28 PM, Damien Kamerman <dami...@gmail.com>
>> wrote:
>>
>> > I've noticed an issue with the Solr 6.5.1 Collections API BACKUP async
>> > command returning early. The state is finished well before one shard is
>> > finished.
>> >
>> > The collection I'm backing up has 12 shards across 6 nodes and I suspect
>> > the issue is that it is not waiting for all backups on the node to
>> finish.
>> >
> > > Alternatively, if I change the request to not be async it works OK, but
> > > sometimes I get the exception "backup the collection time out:180s".
>> >
>> > Has anyone seen this, or knows a workaround?
>> >
>> > Cheers,
>> > Damien.
>> >
>>
>
>


Re: async backup

2017-06-26 Thread Damien Kamerman
Yes, the async command returns, and then I poll with REQUESTSTATUS.

On 27 June 2017 at 01:24, Varun Thacker <va...@vthacker.in> wrote:

> Hi Damien,
>
> A backup command with async is supposed to return early. It is start the
> backup process and return.
>
> Are you using the REQUESTSTATUS (
> http://lucene.apache.org/solr/guide/6_6/collections-api.
> html#collections-api
> ) API to validate if the backup is complete?
>
> On Sun, Jun 25, 2017 at 10:28 PM, Damien Kamerman <dami...@gmail.com>
> wrote:
>
> > I've noticed an issue with the Solr 6.5.1 Collections API BACKUP async
> > command returning early. The state is finished well before one shard is
> > finished.
> >
> > The collection I'm backing up has 12 shards across 6 nodes and I suspect
> > the issue is that it is not waiting for all backups on the node to
> finish.
> >
> > Alternatively, if I change the request to not be async it works OK, but
> > sometimes I get the exception "backup the collection time out:180s".
> >
> > Has anyone seen this, or knows a workaround?
> >
> > Cheers,
> > Damien.
> >
>


async backup

2017-06-25 Thread Damien Kamerman
I've noticed an issue with the Solr 6.5.1 Collections API BACKUP async
command returning early. The state is finished well before one shard is
finished.

The collection I'm backing up has 12 shards across 6 nodes and I suspect
the issue is that it is not waiting for all backups on the node to finish.

Alternatively, if I change the request to not be async it works OK, but
sometimes I get the exception "backup the collection time out:180s".
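For reference, this is roughly how the async backup is issued and polled from
SolrJ (a sketch only; the collection, backup name and location are illustrative,
and it assumes the 6.x CollectionAdminRequest helpers):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.response.RequestStatusState;

public class AsyncBackupSketch {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:2181").build()) {
            // Start the backup asynchronously; processAsync returns the request id.
            String requestId = CollectionAdminRequest
                    .backupCollection("collection1", "collection1")
                    .setLocation("/online/backup")
                    .processAsync(client);

            // Poll REQUESTSTATUS until the overseer reports a terminal state.
            RequestStatusState state;
            do {
                Thread.sleep(5000);
                state = CollectionAdminRequest.requestStatus(requestId)
                        .process(client)
                        .getRequestStatus();
            } while (state == RequestStatusState.SUBMITTED
                    || state == RequestStatusState.RUNNING);

            System.out.println("Backup finished with state: " + state);
        }
    }
}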

Has anyone seen this, or knows a workaround?

Cheers,
Damien.


Re: Issue with highlighter

2017-06-15 Thread Damien Kamerman
Ali, does adding an 'hl.q' param help?  q=something&hl.q=something&...

On 16 June 2017 at 06:21, Ali Husain  wrote:

> Thanks for the replies. Let me try and explain this a little better.
>
>
> I haven't modified anything in solrconfig. All I did was get a fresh
> instance of solr 6.4.1 and create a core testHighlight. I then created a
> content field of type text_en via the Solr Admin UI. id was already there,
> and that is of type string.
>
>
> I then use the UI, once again to check the hl checkbox, hl.fl is set to *
> because I want any and every match.
>
>
> I push the following content into this new solr instance:
>
> id:91101
>
> content:'I am adding something to the core field and we will try and find
> it. We want to make sure the highlighter works!
>
> This is short so fragsize and max characters shouldn\'t be an issue.'
>
> As you can see, very few characters, fragsize, maxAnalyzedChars, all that
> should not be an issue.
>
>
> I then send this query:
>
> http://localhost:8983/solr/testHighlight/select?hl.fl=*&hl=on&indent=on&q=something&wt=json
>
>
> My results:
>
>
> "response":{"numFound":1,"start":0,"docs":[
>
> {"id":"91101",
>
> "content":"I am adding something to the core field and we will try
> and find it. We want to make sure the highlighter works! This is short so
> fragsize and max characters shouldn't be an issue.",
> "_version_":1570302668841156608}]
>
>
> },
>
>
> "highlighting":{
> "91101":{}}
>
>
> I change q to be core instead of something.
>
>
> http://localhost:8983/solr/testHighlight/select?hl.fl=*&hl=on&indent=on&q=core&wt=json
>
>
> {
> "id":"91101",
> "content":"I am adding something to the core field and we will try
> and find it. We want to make sure the highlighter works! This is short so
> fragsize and max characters shouldn't be an issue.",
> "_version_":1570302668841156608},
>
>
>
> "highlighting":{
> "91101":{
>   "content":["I am adding something to the core field and we
> will try and find it. We want to make sure"]}}
>
> I've tried a bunch of queries. 'adding', 'something' both don't return any
> highlights. 'core' 'am' 'field' all work.
>
> Am I doing a better job of explaining this? Quite puzzling why this would
> be happening. My guess is there is some file/config somewhere that is
> ignoring some words? It isn't stopwords.txt in my case though. If that
> isn't the case then it definitely seems like a bug to me.
>
> Thanks, Ali
>
>
> 
> From: David Smiley 
> Sent: Thursday, June 15, 2017 12:33:39 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Issue with highlighter
>
> > Beware of NOT plus OR in a search. That will certainly produce no
> highlights. (eg test -results when default op is OR)
>
> Seems like a bug to me; the default operator shouldn't matter in that case
> I think since there is only one clause that has no BooleanQuery.Occur
> operator and thus the OR/AND shouldn't matter.  The end effect is "test" is
> effectively required and should definitely be highlighted.
>
> Note to Ali: Phil's comment implies use of hl.method=unified which is not
> the default.
>
> On Wed, Jun 14, 2017 at 10:22 PM Phil Scadden 
> wrote:
>
> > Just had similar issue - works for some, not others. First thing to look
> > at is hl.maxAnalyzedChars in the query. The default is quite small.
> > Since many of my documents are large PDF files, I opted to use
> > storeOffsetsWithPositions="true" termVectors="true" on the field I was
> > searching on.
> > This certainly did increase my index size but not too bad and certainly
> > fast.
> > https://cwiki.apache.org/confluence/display/solr/Highlighting
> >
> > Beware of NOT plus OR in a search. That will certainly produce no
> > highlights. (eg test -results when default op is OR)
> >
> >
> > -Original Message-
> > From: Ali Husain [mailto:alihus...@outlook.com]
> > Sent: Thursday, 15 June 2017 11:11 a.m.
> > To: solr-user@lucene.apache.org
> > Subject: Issue with highlighter
> >
> > Hi,
> >
> >
> > I think I've found a bug with the highlighter. I search for the word
> > "something" and I get an empty highlighting response for all the
> documents
> > that are returned shown below. The fields that I am searching over are
> > text_en, the highlighter works for a lot of queries. I have no
> > stopwords.txt list that could be messing this up either.
> >
> >
> >  "highlighting":{
> > "310":{},
> > "103":{},
> > "406":{},
> > "1189":{},
> > "54":{},
> > "292":{},
> > "309":{}}}
> >
> >
> > Just changing the search term to "something like" I get back this:
> >
> >
> > "highlighting":{
> > "310":{},
> > "309":{
> >   "content":["1949 Convention, like those"]},
> > "103":{},
> > "406":{},
> > "1189":{},
> > "54":{},
> > "292":{},
> > "286":{
> >   "content":["persons in these classes are treated like
> > combatants, but in other 

Re: Rule-based Replica Placement not working with Solr 6.5.1

2017-05-23 Thread Damien Kamerman
I'm not sure I fully understand what you're trying to do but this is what I
do to ensure replicas are not on the same rack:

rule=shard:*,replica:<2,sysprop.rack:*
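For example (a sketch; the collection and config names are placeholders, and each
node is assumed to have been started with a -Drack=... system property), the rule
is passed at collection-creation time:

import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class RackRuleCreateSketch {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:2181").build()) {
            ModifiableSolrParams p = new ModifiableSolrParams();
            p.set("action", "CREATE");
            p.set("name", "collection1");
            p.set("collection.configName", "myconf");
            p.set("numShards", "3");
            p.set("replicationFactor", "2");
            // No two replicas of any shard on nodes sharing the same -Drack value.
            p.set("rule", "shard:*,replica:<2,sysprop.rack:*");
            client.request(new GenericSolrRequest(
                    SolrRequest.METHOD.GET, "/admin/collections", p));
        }
    }
}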

On 23 May 2017 at 22:37, Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
wrote:

> Yes, I tried that already.
> Sure, it assigns 2 nodes with port 8983 to shard1 (e.g.
> server1:8983,server2:8983).
> But due to no replica rule (which defaults to wildcard) I also get
> shard3 --> server2:8983,server2:7574
> shard2 --> server1:7574,server3:8983
>
> The result is 3 replicas on server2 and also 2 replicas on one node of
> server2
> but _no_ replica on node server3:7574.
>
> I also tried to really nail it down with the rule:
> rule=shard:shard1,replica:<2,sysprop.rack:1&
> rule=shard:shard2,replica:<2,sysprop.rack:2&
> rule=shard:shard3,replica:<2,sysprop.rack:3
>
> The nodes were started with the correct -Drack=x property, but no luck.
>
> From debugging I can see that the code is written in an "over complicated" way,
> probably to catch all possibilities (core, node, port, ip_x, ...), but it falls
> short of really trying all permutations and obeying the rules.
>
> I will open a ticket for this.
>
> Regards
> Bernd
>
> Am 23.05.2017 um 14:09 schrieb Noble Paul:
> > did you try the rule
> > shard:shard1,port:8983
> >
> > this ensures that all replicas of shard1 are allocated on the node with
> port 8983.
> >
> > if it doesn't, it's a bug. Please open a ticket
> >
> > On Tue, May 23, 2017 at 7:10 PM, Bernd Fehling
> > <bernd.fehl...@uni-bielefeld.de> wrote:
> >> After some analysis it turns out that they compare apples with oranges
> :-(
> >>
> >> Inside "tryAPermutationOfRules" the rule is called with rules.get() and
> >> the next step is calling rule.compare(), but they don't compare the
> nodes
> >> against the rule (or rules). They compare the nodes against each other.
> >>
> >> E.g. server1:8983, server2:7574, server1:7574,...
> >> What do you think will happen if comparing server1:8983 against
> server2:7574 (and so on)???
> >> It will _NEVER_ match!!!
> >>
> >> Regards
> >> Bernd
> >>
> >>
> >> Am 23.05.2017 um 08:54 schrieb Bernd Fehling:
> >>> No, that is way off, because:
> >>> 1. you have no "tag" defined.
> >>>shard and replica can be omitted and they will default to wildcard,
> >>>but a "tag" must be defined.
> >>> 2. replica must be an integer or a wildcard.
> >>>
> >>> Regards
> >>> Bernd
> >>>
> >>> Am 23.05.2017 um 01:17 schrieb Damien Kamerman:
> >>>> If you want all the replicas for shard1 on the same port then I think
> the
> >>>> rule is: 'shard:shard1,replica:port:8983'
> >>>>
> >>>> On 22 May 2017 at 18:47, Bernd Fehling <bernd.fehling@uni-bielefeld.
> de>
> >>>> wrote:
> >>>>
> >>>>> I tried many settings with "Rule-based Replica Placement" on Solr
> 6.5.1
> >>>>> and came to the conclusion that it is not working at all.
> >>>>>
> >>>>> My test setup is 6 nodes on 3 servers (port 8983 and 7574 on each
> server).
> >>>>>
> >>>>> The call to create a new collection is
> >>>>> "http://localhost:8983/solr/admin/collections?action=
> CREATE=boss&
> >>>>> collection.configName=boss_configs=3=2&
> >>>>> maxShardsPerNode=1=shard:shard1,replica:<2,port:8983"
> >>>>>
> >>>>> With "rule=shard:shard1,replica:<2,port:8983" I expect that shard1
> has
> >>>>> only nodes with port 8983 _OR_ it shoud fail due to "strict mode"
> because
> >>>>> the fuzzy operator "~" it not set.
> >>>>>
> >>>>> The result of the call is:
> >>>>> shard1 --> server2:7574 / server1:8983
> >>>>> shard2 --> server1:7574 / server3:8983
> >>>>> shard3 --> server2:8983 / server3:7574
> >>>>>
> >>>>> The expected result should be (at least!!!) shard1 --> server_x:8983
> /
> >>>>> server_y:8983
> >>>>> where "_x" and "_y" can be anything between 1 and 3 but must be
> different.
> >>>>>
> >>>>> I think the problem is somewhere in "class ReplicaAssigner" with
> >>>>> "tryAllPermutations"
> >>>>> and "tryAPermutationOfRules".
> >>>>>
> >>>>> Regards
> >>>>> Bernd
> >>>>>
> >>>>
>


Re: Rule-based Replica Placement not working with Solr 6.5.1

2017-05-22 Thread Damien Kamerman
If you want all the replicas for shard1 on the same port then I think the
rule is: 'shard:shard1,replica:port:8983'

On 22 May 2017 at 18:47, Bernd Fehling 
wrote:

> I tried many settings with "Rule-based Replica Placement" on Solr 6.5.1
> and came to the conclusion that it is not working at all.
>
> My test setup is 6 nodes on 3 servers (port 8983 and 7574 on each server).
>
> The call to create a new collection is
> "http://localhost:8983/solr/admin/collections?action=CREATE=boss;
> collection.configName=boss_configs=3=2&
> maxShardsPerNode=1=shard:shard1,replica:<2,port:8983"
>
> With "rule=shard:shard1,replica:<2,port:8983" I expect that shard1 has
> only nodes with port 8983 _OR_ it shoud fail due to "strict mode" because
> the fuzzy operator "~" it not set.
>
> The result of the call is:
> shard1 --> server2:7574 / server1:8983
> shard2 --> server1:7574 / server3:8983
> shard3 --> server2:8983 / server3:7574
>
> The expected result should be (at least!!!) shard1 --> server_x:8983 /
> server_y:8983
> where "_x" and "_y" can be anything between 1 and 3 but must be different.
>
> I think the problem is somewhere in "class ReplicaAssigner" with
> "tryAllPermutations"
> and "tryAPermutationOfRules".
>
> Regards
> Bernd
>


Re: Join not working in Solr 6.5

2017-05-22 Thread Damien Kamerman
I use a router.field so docs that I join from/to are always in the same
shard.  See
https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud#ShardsandIndexingDatainSolrCloud-DocumentRouting

There is an open ticket SOLR-8297
https://issues.apache.org/jira/browse/SOLR-8297 Allow join query over 2
sharded collections: enhance functionality and exception handling
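For illustration, a sketch of creating such a collection from SolrJ (the
collection, config and field names are assumptions): every doc with the same value
in the route field lands on the same shard, so the join stays shard-local.

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class RouterFieldSketch {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:2181").build()) {
            // Route on group_s instead of the uniqueKey: docs sharing a group_s
            // value are co-located, so {!join} within a group works after sharding.
            CollectionAdminRequest
                    .createCollection("collection1", "myconf", 4, 2)
                    .setRouterField("group_s")
                    .process(client);
        }
    }
}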



On 22 May 2017 at 16:01, mganeshs  wrote:

> Is there any possibility of supporting joins across multiple shards in the near
> future? How do we achieve the join when our data is spread across multiple
> shards? This is very much mandatory when we need to scale out.
>
> Any workarounds if an out-of-the-box possibility is not there?
>
> Thanks,
>
>
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/Join-not-working-in-Solr-6-5-tp4336247p4336256.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Join not working in Solr 6.5

2017-05-21 Thread Damien Kamerman
Your join should be:

{!join from=id to=C_pid_s}
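i.e. with '=' between the local-param keys and values. As a rough SolrJ
equivalent (the basicns core URL comes from the post below; everything else is
illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class JoinQuerySketch {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/basicns").build()) {
            // Take the id of every doc matching type_s:PERSON and return the docs
            // whose C_pid_s holds one of those ids, i.e. the child records.
            SolrQuery q = new SolrQuery("{!join from=id to=C_pid_s}type_s:PERSON");
            QueryResponse rsp = client.query(q);
            rsp.getResults().forEach(d -> System.out.println(d.getFieldValue("id")));
        }
    }
}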

On 22 May 2017 at 14:07, mganeshs  wrote:

> Hi,
>
> I have following records / documents with Parent entity
>
> id,type_s,P_hid_s,P_name_s,P_pid_s
> 11,PERSON,11,Parent1,11
>
> And following records / documents with child entity
>
> id,type_s,C_hid_s,C_name_s,C_pid_s
> 12,PERSON,12,Child2,11
> 13,PERSON,13,Child3,11
> 14,PERSON,14,Child4,11
>
> Now when I try to join and get all children of parent1 whose id is
> 11,
>
> http://localhost:8983/solr/basicns/select?indent=on&q={!join from id to
> C_pid_s} type_s:PERSON&wt=json
>
>
> I get following exception
>  "error":{
> "trace":"java.lang.NullPointerException\r\n\tat
> org.apache.solr.search.JoinQuery.hashCode(JoinQParserPlugin.java:525)\r\
> n\tat
> org.apache.solr.search.QueryResultKey.(QueryResultKey.java:46)\r\n\
> tat
> org.apache.solr.search.SolrIndexSearcher.getDocListC(
> SolrIndexSearcher.java:1754)\r\n\tat
> org.apache.solr.search.SolrIndexSearcher.search(
> SolrIndexSearcher.java:609)\r\n\tat
> org.apache.solr.handler.component.QueryComponent.
> process(QueryComponent.java:547)\r\n\tat
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(
> SearchHandler.java:295)\r\n\tat
> org.apache.solr.handler.RequestHandlerBase.handleRequest(
> RequestHandlerBase.java:173)\r\n\tat
> org.apache.solr.core.SolrCore.execute(SolrCore.java:2440)\r\n\tat
> org.apache.solr.servlet.HttpSolrCall.execute(
> HttpSolrCall.java:723)\r\n\tat
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)\r\n\tat
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDispatchFilter.java:347)\r\n\tat
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDispatchFilter.java:298)\r\n\tat
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> doFilter(ServletHandler.java:1691)\r\n\tat
> org.eclipse.jetty.servlet.ServletHandler.doHandle(
> ServletHandler.java:582)\r\n\tat
> org.eclipse.jetty.server.handler.ScopedHandler.handle(
> ScopedHandler.java:143)\r\n\tat
> org.eclipse.jetty.security.SecurityHandler.handle(
> SecurityHandler.java:548)\r\n\tat
> org.eclipse.jetty.server.session.SessionHandler.
> doHandle(SessionHandler.java:226)\r\n\tat
> org.eclipse.jetty.server.handler.ContextHandler.
> doHandle(ContextHandler.java:1180)\r\n\tat
> org.eclipse.jetty.servlet.ServletHandler.doScope(
> ServletHandler.java:512)\r\n\tat
> org.eclipse.jetty.server.session.SessionHandler.
> doScope(SessionHandler.java:185)\r\n\tat
> org.eclipse.jetty.server.handler.ContextHandler.
> doScope(ContextHandler.java:1112)\r\n\tat
> org.eclipse.jetty.server.handler.ScopedHandler.handle(
> ScopedHandler.java:141)\r\n\tat
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(
> ContextHandlerCollection.java:213)\r\n\tat
> org.eclipse.jetty.server.handler.HandlerCollection.
> handle(HandlerCollection.java:119)\r\n\tat
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> HandlerWrapper.java:134)\r\n\tat
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
> RewriteHandler.java:335)\r\n\tat
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> HandlerWrapper.java:134)\r\n\tat
> org.eclipse.jetty.server.Server.handle(Server.java:534)\r\n\tat
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\r\n\tat
> org.eclipse.jetty.server.HttpConnection.onFillable(
> HttpConnection.java:251)\r\n\tat
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(
> AbstractConnection.java:273)\r\n\tat
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\r\n\tat
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
> SelectChannelEndPoint.java:93)\r\n\tat
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.
> executeProduceConsume(ExecuteProduceConsume.java:303)\r\n\tat
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.
> produceConsume(ExecuteProduceConsume.java:148)\r\n\tat
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(
> ExecuteProduceConsume.java:136)\r\n\tat
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(
> QueuedThreadPool.java:671)\r\n\tat
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(
> QueuedThreadPool.java:589)\r\n\tat
> java.lang.Thread.run(Thread.java:745)\r\n",
> "code":500}}
>
>
> Is there a bug in 6.5, or is something going wrong on my side? I have used the
> basic config that comes with the example and created the collection with one
> shard only, not multiple shards.
>
> Early response will be very much appreciated
>
>
>
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/Join-not-working-in-Solr-6-5-tp4336247.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Slow Bulk InPlace DocValues updates

2017-05-18 Thread Damien Kamerman
Adding more shards will scale your writes.
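For instance, an existing shard can be split so that writes spread over more
shards - a sketch, assuming the SolrJ 6.x Collections API helpers (collection and
shard names are illustrative):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class SplitShardSketch {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:2181").build()) {
            // Split shard1 into two sub-shards; indexing (and the per-shard flush
            // work) is then spread across them. This is heavy on a large index,
            // so consider running it async and polling REQUESTSTATUS.
            CollectionAdminRequest.splitShard("collection1")
                    .setShardName("shard1")
                    .process(client);
        }
    }
}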

On 18 May 2017 at 20:08, Dan .  wrote:

> Hi,
>
> -Solr 6.5.1
> -SSD disk
> -23M docs index 64G single shard
>
> I'm trying to do around 4M in-place docValue updates to a collection
> (single shard of around 23M docs) [these are ALL in-place updates]
>
>  I can add the updates in around 7mins, but flushing to disk takes around
> 40mins! I've been able to add the updates quickly by adding:
>
> 
> 4000
>   
>
> autoSoftCommit/autoCommit currently disabled.
>
> From the thread dump I see that the flush is in a single thread and
> extremely slow. Dump below; the culprit seems to be
> org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdates(BufferedUpdatesStream.java:666):
>
>    org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:256)
>    org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:248)
>    org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:538)
>    ...
>    org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdates(BufferedUpdatesStream.java:666)
>    org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdatesList(BufferedUpdatesStream.java:612)
>    org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:269)
>    org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3454)
>    org.apache.lucene.index.IndexWriter.applyDeletesAndPurge(IndexWriter.java:4990)
>    org.apache.lucene.index.DocumentsWriter$ApplyDeletesEvent.process(DocumentsWriter.java:717)
>    org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5040)
>    org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5031)
>    org.apache.lucene.index.IndexWriter.updateDocValues(IndexWriter.java:1731)
>    org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:911)
>    org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:302)
>    org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
>    org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
>
>
> I think this is related to
> SOLR-6838 [https://issues.apache.org/jira/browse/SOLR-6838]
> and
> LUCENE-6161 [https://issues.apache.org/jira/browse/LUCENE-6161]
>
> I need to make the flush faster, to complete the update quicker. Does anyone
> have a workaround or any suggestions?
>
> Many thanks,
> Dan
>


Re: SolrJ - How to add a blocked document without child documents

2017-05-15 Thread Damien Kamerman
Does this fl help?

fl=*,[child childFilter="docType:child" parentFilter=docType:parent]
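For illustration, a sketch of using that fl from SolrJ (the docType values mirror
the filters above and are assumptions about the schema):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class ChildTransformerSketch {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/collection1").build()) {
            SolrQuery q = new SolrQuery("docType:parent");
            // The [child] transformer re-attaches the nested children to each parent
            // hit, so a parent that simply has no children comes back without any.
            q.setFields("*",
                "[child parentFilter=docType:parent childFilter=docType:child]");
            client.query(q).getResults().forEach(d ->
                System.out.println(d.getFieldValue("id")
                    + " hasChildren=" + d.hasChildDocuments()));
        }
    }
}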

On 14 May 2017 at 16:16, Jeffery Yuan  wrote:

> Nested documents are quite useful for modeling hierarchical data.
>
> Sometimes we only have a parent document which doesn't have child documents
> yet; we want to add it first, and then later update it: re-add the whole
> block including the parent document and all its child documents.
>
> But we found out that on the server there would then be two parent documents
> with the same id: one without child documents, and the other one which
> contains child documents.
>
> http://localhost:8983/solr/thecollection_shard1_replica2/
> select?q=id:*=*,[docid]=false
> 
>   
> parent
> 9816c0f3-f3ae-4a7c-a5fe-89a2c481467a
> 0
>   
>   
> child
> e27d2709-2dc0-439d-b017-4d95212bf05f
> 
>   9816c0f3-f3ae-4a7c-a5fe-89a2c481467a
> 
> 1
>   
>   
> parent
> 9816c0f3-f3ae-4a7c-a5fe-89a2c481467a
> 
>   9816c0f3-f3ae-4a7c-a5fe-89a2c481467a
> 
> 2
>   
> 
>
> How can I avoid the duplicate parent documents?
> How could I add a blocked document without child documents?
>
> - I can work around this by deleting first before adding the new documents,
> but performance would suffer
>
> Thanks a lot for your help and response.
>
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/SolrJ-How-to-add-a-blocked-document-without-
> child-documents-tp4335006.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Suggester uses lots of 'Page cache' memory

2017-05-09 Thread Damien Kamerman
Memory/cache aside, the fundamental Solr issue is that the Suggester build
operation will read the entire index, even though very few docs have the
relevant fields.

Is there a way to set a 'fq' on the Suggester build?

   java.lang.Thread.State: RUNNABLE
at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:135)
at
org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:138)
at
org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState.document(CompressingStoredFieldsReader.java:560)
at
org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.document(CompressingStoredFieldsReader.java:576)
at
org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:583)
at org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
at
org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
at
org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
at
org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:118)
at
org.apache.lucene.index.IndexReader.document(IndexReader.java:381)
at
org.apache.lucene.search.suggest.DocumentDictionary$DocumentInputIterator.next(DocumentDictionary.java:165)
at
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.build(AnalyzingInfixSuggester.java:300)
- locked <0x0004b8f29260> (a java.lang.Object)
at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:190)
at
org.apache.solr.spelling.suggest.SolrSuggester.build(SolrSuggester.java:178)
at
org.apache.solr.handler.component.SuggestComponent.prepare(SuggestComponent.java:179)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:269)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2299)
at
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)


On 3 May 2017 at 12:47, Damien Kamerman <dami...@gmail.com> wrote:

> Thanks Shawn, I'll have to look closer into this.
>
> On 3 May 2017 at 12:10, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 5/2/2017 6:46 PM, Damien Kamerman wrote:
>> > Shalin, yes I think it's a case of the Suggester build hitting the index
>> > all at once. I'm thinking it's hitting all docs, even the ones without
>> > fields relevant to the suggester.
>> >
>> > Shawn, I am using ZFS, though I think it's comparable to other setups.
>> > mmap() should still be faster, while the ZFS ARC cache may prefer more
>> > memory that other OS disk caches.
>> >
>> > So, it sounds like I need enough memory/swap to hold the entire index.
>> > When will the memory be released? On a commit?
>> > https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/
>> store/MMapDirectory.html
>> > talks about a bug on the close().
>>
>> What I'm going to describe below is how things *normally* work on most
>> operating systems (think Linux or Windows) with most filesystems.  If
>> ZFS is different, and it sounds like it might be, then that's something
>> for you to discuss with Oracle.
>>
>> Normally, MMap doesn't *allocate* any memory -- so there's nothing to
>> release later.  It asks the operating system to map the file's contents
>> to a section of virtual memory, and then the program accesses that
>> memory block directly.
>>
>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>>
>> A typical OS takes care of translating accesses to MMap virtual memory
>> into disk accesses, and uses available system memory to cache the data
>> that's read so a subsequent access of the same data is super fast.
>>
>> On most operating systems, memory in the disk cache is always available
>> to programs that request it for an allocation.
>>
>> ZFS uses a completely separate piece of memory for caching -- the ARC
>> cache.  I do not know if the OS is able to release memory from that
>> cache when a program requests it.  My experience with ZFS on Linux  (not
>> with Solr) suggests that the ARC cache holds onto memory a lot tighter
>> than the standard OS disk cache.  ZFS on Solaris might be a different
>> animal, though.
>>
>> I'm finding conflicting infor

Re: Suggester uses lots of 'Page cache' memory

2017-05-02 Thread Damien Kamerman
Thanks Shawn, I'll have to look closer into this.

On 3 May 2017 at 12:10, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/2/2017 6:46 PM, Damien Kamerman wrote:
> > Shalin, yes I think it's a case of the Suggester build hitting the index
> > all at once. I'm thinking it's hitting all docs, even the ones without
> > fields relevant to the suggester.
> >
> > Shawn, I am using ZFS, though I think it's comparable to other setups.
> > mmap() should still be faster, while the ZFS ARC cache may prefer more
> > memory that other OS disk caches.
> >
> > So, it sounds like I need enough memory/swap to hold the entire index.
> > When will the memory be released? On a commit?
> > https://lucene.apache.org/core/6_5_0/core/org/apache/
> lucene/store/MMapDirectory.html
> > talks about a bug on the close().
>
> What I'm going to describe below is how things *normally* work on most
> operating systems (think Linux or Windows) with most filesystems.  If
> ZFS is different, and it sounds like it might be, then that's something
> for you to discuss with Oracle.
>
> Normally, MMap doesn't *allocate* any memory -- so there's nothing to
> release later.  It asks the operating system to map the file's contents
> to a section of virtual memory, and then the program accesses that
> memory block directly.
>
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> A typical OS takes care of translating accesses to MMap virtual memory
> into disk accesses, and uses available system memory to cache the data
> that's read so a subsequent access of the same data is super fast.
>
> On most operating systems, memory in the disk cache is always available
> to programs that request it for an allocation.
>
> ZFS uses a completely separate piece of memory for caching -- the ARC
> cache.  I do not know if the OS is able to release memory from that
> cache when a program requests it.  My experience with ZFS on Linux  (not
> with Solr) suggests that the ARC cache holds onto memory a lot tighter
> than the standard OS disk cache.  ZFS on Solaris might be a different
> animal, though.
>
> I'm finding conflicting information regarding MMap problems on ZFS.
> Some sources say that memory usage is doubled (data in both the standard
> page cache and the arc cache), some say that this is not a general
> problem.  This is probably a question for Oracle to answer.
>
> You don't want to count swap space when looking at how much memory you
> have.  Swap performance is REALLY bad.
>
> Thanks,
> Shawn
>
>


Re: Suggester uses lots of 'Page cache' memory

2017-05-02 Thread Damien Kamerman
Shalin, yes I think it's a case of the Suggester build hitting the index
all at once. I'm thinking it's hitting all docs, even the ones without
fields relevant to the suggester.

Shawn, I am using ZFS, though I think it's comparable to other setups.
mmap() should still be faster, while the ZFS ARC cache may prefer more
memory that other OS disk caches.

So, it sounds like I need enough memory/swap to hold the entire index. When will
the memory be released? On a commit?
https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/store/MMapDirectory.html
talks about a bug on the close().


On 2 May 2017 at 23:07, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/1/2017 10:52 PM, Damien Kamerman wrote:
> > I have a Solr v6.4.2 collection with 12 shards and 2 replicas. Each
> > replica uses about 14GB disk usage. I'm using Solaris 11 and I see the
> > 'Page cache' grow by about 7GB for each suggester replica I build. The
> > suggester index itself is very small. The 'Page cache' memory is freed
> > when the node is stopped. I guess the Suggester component is mmap'ing
> > the entire Lucene index into memory and holding it? Is this expected
> > behavior? Is there a workaround?
>
> I found the following.  The last comment on the answer, the one about
> mmap causing double-buffering with ZFS, is possibly relevant:
>
> https://serverfault.com/a/270604
>
> What filesystem are your indexes on?  If it's ZFS, it could completely
> explain the behavior.  If it's not ZFS, then the only part of it that I
> cannot explain is the fact that the page cache is freed when Solr stops.
>
> If this double-buffering actually means that the memory is allocated
> twice, then I think that ZFS is probably the wrong filesystem to run
> Solr on, unless you have a LOT of spare memory.  You could try changing
> the directory factory to one that doesn't use MMAP, but the suggester
> index factory probably cannot be easily changed.  This is too bad --
> normally MMAP is far more efficient than "standard" filesystem access.
>
> I could be reaching completely wrong conclusions based on the limited
> research I did.
>
> Thanks,
> Shawn
>
>


Suggester uses lots of 'Page cache' memory

2017-05-01 Thread Damien Kamerman
Hi all,

I have a Solr v6.4.2 collection with 12 shards and 2 replicas. Each replica
uses about 14GB disk usage. I'm using Solaris 11 and I see the 'Page cache'
grow by about 7GB for each suggester replica I build. The suggester index
itself is very small. The 'Page cache' memory is freed when the node is
stopped.

I guess the Suggester component is mmap'ing the entire Lucene index into
memory and holding it? Is this expected behavior? Is there a workaround?

I use this command to build the suggester for just the replica
'target1_shard1_replica1':
curl "
http://localhost:8983/solr/collection1/suggest?suggest.dictionary=mySuggester=true=localhost:8983/solr/target1_shard1_replica1
"

BTW: Without the 'shards' param the distributed request will randomly hit
half the replicas.
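For reference, the same targeted build issued from SolrJ would look roughly like
this (collection and core names as in the curl above):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class SuggesterBuildSketch {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:2181").build()) {
            SolrQuery q = new SolrQuery();
            q.setRequestHandler("/suggest");
            q.set("suggest.dictionary", "mySuggester");
            q.set("suggest.build", true);
            // Pin the request to one core so only that replica's suggester is built.
            q.set("shards", "localhost:8983/solr/target1_shard1_replica1");
            client.query("collection1", q);
        }
    }
}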

From my solrconfig.xml:


<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="indexPath">mySuggester</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">mySuggest</str>
    <str name="weightField">x</str>
    <str name="suggestAnalyzerFieldType">suggestTypeLc</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>



Cheers,
Damien.


streaming expressions parallel merge

2017-04-13 Thread Damien Kamerman
Hi,

With Solr streaming expressions, is there a way to parallel-merge a number
of Solr streams, or a way to apply the parallel function to something like
this?

merge(
   search(collection1, ...),
   search(collection2, ...),
   ...
   on="id asc"
)

Cheers,
Damien.


Re: block join - search together at parent and childern

2017-03-19 Thread Damien Kamerman
It can work with multiple searches: one search would return A docs, a second
search would return B docs, etc.

Alternatively, use solr streaming expressions which have proper joins.

On 19 March 2017 at 19:58, Jan Nekuda  wrote:

> Hi Michael,
> thank you for the fast answer - I have tried it, but it's not exactly what I
> need. I hope I understood it correctly - the problem is that if I
> search for foo bar and foo bar is not found in the root entity, then it returns
> nothing, even if some field in the children contains foo bar.
> I need to search for foo bar and find all documents where foo bar exists in
> document A OR B OR C OR D, even if A has foo and, e.g., C has
> bar. But if I search for bar of chocolate then it should return nothing.
>
> my idea was to use
> edismax and filter query for each word:
> http://localhost:8983/solr/demo/select?q=*:*={!parent
> which=type:root}foo*={!parent
> which=typ:root}bar*=json=true=edismax&
> qf=$allfields=true=true,
> first_country, power, name, country
>
> the problem is that I'm not able to match parent documents in the same
> condition as the children.
>
> As I wrote, I'm able to solve it with another parent so that doc A also
> becomes a child and everything works fine - but I would like to solve it
> better.
>
>
> Do you have or someone else another idea?:)
>
> Thanks
> Jan
>
>
> 2017-03-16 21:51 GMT+01:00 Mikhail Khludnev :
>
> > Hello Jan,
> >
> > What if you combine child and parent dismaxes like below
> > q={!edismax qf=$parentfields}foo bar {!parent ..}{!dismax qf=$childfields
> > v=$childclauses}&childclauses=foo bar +type:child&childfields=...&
> > parentfields=...
> >
> > On Thu, Mar 16, 2017 at 10:54 PM, Jan Nekuda 
> wrote:
> >
> > > Hello Mikhail,
> > >
> > > thanks for fast answer. The problem is, that I want to have the dismax
> on
> > > child and parent together - to have the filter evaluated together.
> > >
> > > I need to have documents:
> > >
> > >
> > > path: car
> > >
> > > type:car
> > >
> > > color:red
> > >
> > > first_country: CZ
> > >
> > > name:seat
> > >
> > >
> > >
> > > path: car\engine
> > >
> > > type:engine
> > >
> > > power:63KW
> > >
> > >
> > >
> > > path: car\engine\manufacturer
> > >
> > > type:manufacturer
> > >
> > > name: xx
> > >
> > > country:PL
> > >
> > >
> > > path: car
> > >
> > > type:car
> > >
> > > color:green
> > >
> > > first_country: CZ
> > >
> > > name:skoda
> > >
> > >
> > >
> > > path: car\engine
> > >
> > > type:engine
> > >
> > > power:88KW
> > >
> > >
> > >
> > > path: car\engine\manufacturer
> > >
> > > type:manufacturer
> > >
> > > name: yy
> > >
> > > country:PL
> > >
> > >
> > > where car is parent document engine is its child a manufacturer is
> child
> > > of engine and the structure can be deep.
> > >
> > > I need to make a query with edismax over fields color, first_country,
> > > power, name, country over parent and all childern.
> > >
> > > when I search for "seat 63kw" I need to get the seat car
> > >
> > > the same if I write only "seat" or only "63kw" or only "xx"
> > >
> > > but if I write "seat 88kw" I expect to get no result
> > >
> > > I need to return the parents in whose tree all the words I wrote in the
> > > query occur.
> > >
> > > How I wrote before my solution was to split the query text and use
> q:*:*
> > > and for each /word/ in query make
> > >
> > > fq={!parent which=type:car}word
> > >
> > > and edismax with qf=color, first_country, power, name, country
> > >
> > > Thank you for your time:)
> > >
> > > Jan
> > >
> > >
> > > Dne 16.03.2017 v 20:00 Mikhail Khludnev napsal(a):
> > >
> > >
> > > Hello,
> > >>
> > >> It's hard to get into the problem. but you probably want to have
> dismax
> > on
> > >> child level:
> > >> q={!parent ...}{!edismax qf='childF1 childF2' v=$chq}=foo bar
> > >> It's usually broken because child query might match parents which is
> not
> > >> allowed. Thus, it's probably can solved by adding +type:child into
> chq.
> > >> IIRC edismax supports lucene syntax.
> > >>
> > >> On Thu, Mar 16, 2017 at 4:47 PM, Jan Nekuda 
> > wrote:
> > >>
> > >> Hi,
> > >>> I have a question for which I wasn't able to find a good solution.
> > >>> I have this structure of documents
> > >>>
> > >>> A
> > >>> |\
> > >>> | \
> > >>> B \
> > >>>   \
> > >>>C
> > >>> \
> > >>>  \
> > >>>   \
> > >>>D
> > >>>
> > >>> Document type A has fields id_number, date_from, date_to
> > >>> Document type C  has fields first_name, surname, birthdate
> > >>> Document type D AND B has fields street_name, house_number, city
> > >>>
> > >>>
> > >>> I want to find *all parents with block join and edismax*.
> > >>> The problem is that I have found it is only possible to find children by
> > >>> parent, or parents by children.
> > >>> *I want to find parents by values in the parent and in the children*. I
> > >>> want to use edismax with all fields from all documents (id_number, 

Re: fq performance

2017-03-18 Thread Damien Kamerman
You may want to consider a join, especially if you ever expect thousands of
groups, e.g.
fq={!join from=access_control_group
to=doc_group}access_control_user_id:USERID
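
A minimal SolrJ sketch of the same join filter (the collection name
"collection1" and the literal USERID are assumptions; it needs a SolrJ version
that has HttpSolrClient.Builder):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class AclJoinFilter {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/collection1").build()) {
            SolrQuery q = new SolrQuery("*:*");
            // keep only docs whose doc_group joins to an ACL doc for this user
            q.addFilterQuery("{!join from=access_control_group"
                    + " to=doc_group}access_control_user_id:USERID");
            QueryResponse rsp = client.query(q);
            System.out.println("matches: " + rsp.getResults().getNumFound());
        }
    }
}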

On 18 March 2017 at 05:57, Yonik Seeley  wrote:

> On Fri, Mar 17, 2017 at 2:17 PM, Shawn Heisey  wrote:
> > On 3/17/2017 8:11 AM, Yonik Seeley wrote:
> >> For Solr 6.4, we've managed to circumvent this for filter queries and
> >> other contexts where scoring isn't needed.
> >> http://yonik.com/solr-6-4/  "More efficient filter queries"
> >
> > Nice!
> >
> > If the filter looks like the following (because q.op=AND), does it still
> > use TermsQuery?
> >
> > fq=id:(id1 OR id2 OR id3 OR ... id2000)
>
> Yep, that works as well.  As does fq=id:id1 OR id:id2 OR id:id3 ...
> Was implemented here: https://issues.apache.org/jira/browse/SOLR-9786
>
> -Yonik
>


Re: Finding time of last commit to index from SolrJ?

2017-03-16 Thread Damien Kamerman
I ended up doing something like this:

String core = "collection1_shard1_core1";
ModifiableSolrParams p = new ModifiableSolrParams();
p.set("show", "index");
// POST is SolrRequest.METHOD.POST; the relative path resolves to the core's Luke handler
GenericSolrRequest checkRequest = new GenericSolrRequest(POST, "/../" +
core + "/admin/luke", p);
NamedList<Object> checkResult = client.request(checkRequest, "collection1");
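
If it helps, the last commit time can then be read out of that response - a
hedged follow-up to the snippet above, assuming the usual Luke handler layout
where the index stats sit under an "index" entry:

NamedList<Object> index = (NamedList<Object>) checkResult.get("index");
Object lastModified = index.get("lastModified"); // typically a java.util.Date over javabin
System.out.println("last commit: " + lastModified);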

On 16 March 2017 at 14:20, Phil Scadden  wrote:

> The admin gui displays the time of last commit to a core but how can this
> be queried from within SolrJ?
>
> Notice: This email and any attachments are confidential and may not be
> used, published or redistributed without the prior written consent of the
> Institute of Geological and Nuclear Sciences Limited (GNS Science). If
> received in error please destroy and immediately notify GNS Science. Do not
> copy or disclose the contents.
>


SQL JOIN eta

2017-03-14 Thread Damien Kamerman
Hi all, does anyone know roughly when the SQL JOIN functionality will be
released? Is there a Jira for this? I'm guessing this might be in Solr 6.6.

Cheers,
Damien.


Re: Data Import Handler, also "Real Time" index updates

2017-03-05 Thread Damien Kamerman
You could configure the DataImportHandler to not delete at the start
(either do a delta-import or set preImportDeleteQuery), and set a
postImportDeleteQuery if required.
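
A hedged SolrJ sketch (not from the original reply) of the related clean=false
option, which also skips the delete that normally runs at the start of a
full-import; the /dataimport path and the collection name "collection1" are
assumptions:

import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class FullImportNoClean {
    public static void main(String[] args) throws Exception {
        ModifiableSolrParams p = new ModifiableSolrParams();
        p.set("command", "full-import");
        p.set("clean", "false");  // keep existing docs instead of deleting them first
        p.set("commit", "true");
        try (HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            client.request(new GenericSolrRequest(
                    SolrRequest.METHOD.GET, "/dataimport", p), "collection1");
        }
    }
}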

On Saturday, 4 March 2017, Alexandre Rafalovitch  wrote:

> Commit is index global. So if you have overlapping timelines and commit is
> issued, it will affect all changes done to that point.
>
> So, the aliases may be better for you. You could potentially also reload a
> core with changed solrconfig.xml settings, but that's heavy on caches.
>
> Regards,
>Alex
>
> On 3 Mar 2017 1:21 PM, "Sales"  >
> wrote:
>
>
> >
> > You have indicated that you have a way to avoid doing updates during the
> > full import.  Because of this, you do have another option that is likely
> > much easier for you to implement:  Set the "commitWithin" parameter on
> > each update request.  This works almost identically to autoSoftCommit,
> > but only after a request is made.  As long as there are never any of
> > these updates during a full import, these commits cannot affect that
> import.
>
> I had attempted at least to say that there may be a few updates that happen
> at the start of an import, so, they are while an import is happening just
> due to timing issues. Those will be detected, and, re-executed once the
> import is done though. But my question here is if the update is using
> commitWithin, then, does that only affect those updates that have the
> parameter, or, does it then also soft commit the in progress import? I
> cannot guarantee that zero updates will be done as there is a timing issue
> at the very start of the import, so, a few could cross over.
>
> Adding commitWithin is fine. Just want to make sure those that might
> execute for the first few seconds of an import don’t kill anything.
> >
> > No matter what is happening, you should have autoCommit (not
> > autoSoftCommit) configured with openSearcher set to false.  This will
> > ensure transaction log rollover, without affecting change visibility.  I
> > recommend a maxTime of one to five minutes for this.  You'll see 15
> > seconds as the recommended value in many places.
> >
> > https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> Oh, we are fine with much longer, does not have to be instant. 10-15
> minutes would be fine.
>
> >
> > Thanks
> > Shawn
> >
>


Re: Implicit routing, delete on specific shard

2017-03-01 Thread Damien Kamerman
I assume with the implicit router you would do something like curl
"http://127.0.0.1:8983/solr/collection1_20170220_replica1/update?commit=false"

On 28 February 2017 at 22:39, philippa griggs  wrote:

> Hello,
>
>
> Solr 5.4.1 using Solr Cloud, multiple cores with two cores per shard.
> Zookeeper 3.4.6   (5 zookeeper ensemble).
>
> We use an implicit router and split shards into weeks. Every now and again
> I need to run a delete on the system.  I do this by running the following
> command on one of the instances.
>
> curl http://127.0.0.1:8983/solr/collection1/update/?commit=false -H
> "Content-Type: text/xml" -d "XXX"
>
>
> Is there any way of specifying the shards to run the delete on, instead of
> running it against the whole collection? I will always know what shards the
> sessions I want to delete will be on.
>
> I know when you query, you can do something like this:
>
> http://XXX:8983/solr/collection1/select?q=*%3A*=json=true=
> 20170220
>
> Is there similar function with the delete?
>
> Something like:
>
> curl http://127.0.0.1:8983/solr/collection1/update/?commit=false -H
> "Content-Type: text/xml" -d "XXX" -shard
> "20170220"
>
> Many thanks
>
> Philippa
>
>


6.4.0 Realtime get can't find shard

2017-02-05 Thread Damien Kamerman
Hi,

I have a collection with 12 shards using the compositeId router and find
that the real-time get always throws an exception 'Can't find shard'. The
get works if I specify the exact /collection_shardX_replica1. I specify the
'id' and '_route_' params, for example /collection/get?id=test&_route_=test1

org.apache.solr.common.SolrException;
null:org.apache.solr.common.SolrException: Can't find shard
'ip_0_2017_q1_shard3'
at
org.apache.solr.handler.component.RealTimeGetComponent.sliceToShards(RealTimeGetComponent.java:536)
at
org.apache.solr.handler.component.RealTimeGetComponent.createSubRequests(RealTimeGetComponent.java:483)
at
org.apache.solr.handler.component.RealTimeGetComponent.distributedProcess(RealTimeGetComponent.java:439)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:345)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2306)
at
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)

I assume this is a bug.

Cheers,
Damien.


Re: Migrate Documents to Another Collection

2017-02-05 Thread Damien Kamerman
Try with split.key=!
This will migrate all docs.
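
A hedged SolrJ sketch of that MIGRATE call (collection names are the ones from
the thread; the base URL is an assumption):

import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class MigrateAll {
    public static void main(String[] args) throws Exception {
        ModifiableSolrParams p = new ModifiableSolrParams();
        p.set("action", "MIGRATE");
        p.set("collection", "c1");
        p.set("target.collection", "c2");
        p.set("split.key", "!");  // per the note above, '!' migrates all docs
        try (HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8081/solr").build()) {
            System.out.println(client.request(new GenericSolrRequest(
                    SolrRequest.METHOD.GET, "/admin/collections", p)));
        }
    }
}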

On 6 February 2017 at 14:31, alias <524839...@qq.com> wrote:

> I used this command, but it had no effect:
> http://localhost:8081/solr/admin/collections?action=MIGRATE&collection=c1&split.key=c1_&target.collection=c2
>
>
>
> > -- Original Message --
> > From: "Erick Erickson";
> > Sent: Monday, 6 February 2017, 11:24 AM
> > To: "solr-user";
> > Subject: Re: Migrate Documents to Another Collection
>
>
>
> I do not understand the problem. The http command you've shown is
> simply creating a collection. What does that have to do with migrating
> from c1 to c2? In fact, it is recreating c1. Assuming you haven't
> > redefined the <uniqueKey>, then the router field defaults to "id". So
> this is just recreating a plain-vanilla collection. If you already
> have a "c1" collection, I expect the above would simply fail.
>
> There's nothing indicating you tried to move docs from C1 to C2 as the
> title of this e-mail indicates. There's no magic here, Solr
> collections are completely independent of one another. If you want
> docs to be moved, _you_ have to move them.
>
> Best,
> Erick
>
>
>
> On Sun, Feb 5, 2017 at 6:20 PM, alias <524839...@qq.com> wrote:
> > Hello, please help me with this question.
> >
> >
> >
> > I have a question about merging Solr collections, as follows: I have two
> > collections, c1 and c2,
> >
> > C1 colleciton there are 10 data, id is from c1_0 to c1_9,
> >
> > C2 colleciton also has 10 data, id is from c2_0 to c2_9,
> >
> > I now want to move the data whose ids have the c1_ prefix from c1 into c2. I
> > ran the following command, but it seems to have no effect - why?
> >
> > When creating c1 I specified router.field=id:
> >
> > http://localhost:8081/solr/admin/collections?action=CREATE&name=c1&numShards=3&replicationFactor=3&maxShardsPerNode=3&collection.configName=myconf&router.field=id
> >
> > I refer to
> > https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api12
> >
> > Solr version 6.3.0
> >
> > Do I have a problem? Or is my understanding wrong?
>


Re: backward compatibility of Solr 6.3 version with old Sol4j clients

2017-02-05 Thread Damien Kamerman
Regarding CloudSolrServer, I tried this briefly and found the client was
only aware of the old shared clusterstate.json.

On 4 February 2017 at 06:23, Shawn Heisey  wrote:

> On 2/3/2017 10:12 AM, Suresh Pendap wrote:
> > Will Solrj client 4.10.3 version work with Solr 6.3 version of the
> > server? I was trying to look up the documentation but no where the
> > compatibility matrix between server and client is provided. Has some
> > one already used this combination?
>
> If it's HttpSolrServer (HttpSolrClient in newer versions), chances are
> good that it will work.  The basic http API in Solr does not change
> quickly.  If you run into problems, provide detailed information here or
> on the IRC channel and we'll try to help you work through them.
>
> I have done quite a bit of version mismatching with the http client.
> Currently I have code with the 6.x client that connects to 4.x, 5.x, and
> 6.x servers.  I have also used older clients with newer servers and had
> no issues.
>
> If it's CloudSolrServer (CloudSolrClient in newer versions), I wouldn't
> even try to make it work with that wide a version discrepancy.
> SolrCloud has evolved so rapidly over the last couple of years that
> connecting different client and server versions may not work at all.
> For best compatibility, they should be identical versions.  If they
> aren't, SolrJ should be newer than Solr, be from the same major version,
> and not be offset by more than one or two minor releases.
>
> Thanks,
> Shawn
>
>


Re: Facet Null Pointer Exception with upgraded indexes

2017-01-02 Thread Damien Kamerman
Try docValues="false" in your v6 schema.xml. You will may need to upgrade
again.

On 31 December 2016 at 07:59, Mikhail Khludnev  wrote:

> Hi Andreu,
>
> I think it can't facet text field anymore per se
> https://issues.apache.org/jira/browse/SOLR-8362.
>
> On Fri, Dec 30, 2016 at 5:07 PM, Andreu Marimon  wrote:
>
> > Hi,
> >
> > I'm trying to update from solr 4.3 to 6.3. We are doing a two step
> > migration and, during the first step, we upgraded the indexes from 4.3 to
> > 5.5, which is the newest version we can get without errors using the
> Lucene
> > IndexUpgrader tool. As far as I know, 6.3.0 should be able to read indexes
> > generated with 5.5.
> >
> > Our problem is that, despite loading the cores and data correctly, every
> > query returns a NullPointerException during the facet counting. We can
> get
> > the results anyways, but the facets are not properly set and this error
> > appears in the response:
> >
> > "error":{
> > "metadata":[
> >   "error-class","org.apache.solr.common.SolrException",
> >   "root-error-class","java.lang.NullPointerException"],
> > "msg":"Exception during facet.field: fa_source",
> > "trace":"org.apache.solr.common.SolrException: Exception during
> > facet.field: fa_source\n\tat
> > org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(
> > SimpleFacets.java:766)\n\tat
> > java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat
> > org.apache.solr.request.SimpleFacets$2.execute(
> > SimpleFacets.java:699)\n\tat
> > org.apache.solr.request.SimpleFacets.getFacetFieldCounts(
> > SimpleFacets.java:775)\n\tat
> > org.apache.solr.handler.component.FacetComponent.
> > getFacetCounts(FacetComponent.java:321)\n\tat
> > org.apache.solr.handler.component.FacetComponent.
> > process(FacetComponent.java:265)\n\tat
> > org.apache.solr.handler.component.SearchHandler.handleRequestBody(
> > SearchHandler.java:295)\n\tat
> > org.apache.solr.handler.RequestHandlerBase.handleRequest(
> > RequestHandlerBase.java:153)\n\tat
> > org.apache.solr.core.SolrCore.execute(SolrCore.java:2213)\n\tat
> > org.apache.solr.servlet.HttpSolrCall.execute(
> HttpSolrCall.java:654)\n\tat
> > org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)\n\tat
> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > SolrDispatchFilter.java:303)\n\tat
> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > SolrDispatchFilter.java:254)\n\tat
> > org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> > doFilter(ServletHandler.java:1668)\n\tat
> > org.eclipse.jetty.servlet.ServletHandler.doHandle(
> > ServletHandler.java:581)\n\tat
> > org.eclipse.jetty.server.handler.ScopedHandler.handle(
> > ScopedHandler.java:143)\n\tat
> > org.eclipse.jetty.security.SecurityHandler.handle(
> > SecurityHandler.java:548)\n\tat
> > org.eclipse.jetty.server.session.SessionHandler.
> > doHandle(SessionHandler.java:226)\n\tat
> > org.eclipse.jetty.server.handler.ContextHandler.
> > doHandle(ContextHandler.java:1160)\n\tat
> > org.eclipse.jetty.servlet.ServletHandler.doScope(
> > ServletHandler.java:511)\n\tat
> > org.eclipse.jetty.server.session.SessionHandler.
> > doScope(SessionHandler.java:185)\n\tat
> > org.eclipse.jetty.server.handler.ContextHandler.
> > doScope(ContextHandler.java:1092)\n\tat
> > org.eclipse.jetty.server.handler.ScopedHandler.handle(
> > ScopedHandler.java:141)\n\tat
> > org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(
> > ContextHandlerCollection.java:213)\n\tat
> > org.eclipse.jetty.server.handler.HandlerCollection.
> > handle(HandlerCollection.java:119)\n\tat
> > org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> > HandlerWrapper.java:134)\n\tat
> > org.eclipse.jetty.server.Server.handle(Server.java:518)\n\tat
> > org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)\n\tat
> > org.eclipse.jetty.server.HttpConnection.onFillable(
> > HttpConnection.java:244)\n\tat
> > org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(
> > AbstractConnection.java:273)\n\tat
> > org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat
> > org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
> > SelectChannelEndPoint.java:93)\n\tat
> > org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.
> > produceAndRun(ExecuteProduceConsume.java:246)\n\tat
> > org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(
> > ExecuteProduceConsume.java:156)\n\tat
> > org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(
> > QueuedThreadPool.java:654)\n\tat
> > org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(
> > QueuedThreadPool.java:572)\n\tat
> > java.lang.Thread.run(Thread.java:745)\nCaused by:
> > java.lang.NullPointerException\n\tat
> > org.apache.solr.request.DocValuesFacets.getCounts(
> > DocValuesFacets.java:117)\n\tat
> > org.apache.solr.request.SimpleFacets.getTermCounts(
> > SimpleFacets.java:530)\n\tat
> > org.apache.solr.request.SimpleFacets.getTermCounts(
> > 

Re: Reindex after schema change options

2016-12-01 Thread Damien Kamerman
I'm in a similar situation where I'm
using org.apache.lucene.index.IndexUpgrader to upgrade an index from solr 4
to solr 6, and want to add docValues to the schema.

All my fields are stored so I assume I could use the DataImportHandler
SolrEntityProcessor to copy the collection to a new collection and pick up
the docValues that way.

Will this work and is there a better (command line) way?

Regards,
Damien
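
A hedged SolrJ alternative (a sketch, not a tested recipe): walk the old
collection with a cursor and re-add the stored fields to the new collection so
it picks up the new schema's docValues. Collection names and the ZK address are
placeholders, copyField targets may also need skipping, and it assumes a SolrJ
version with CloudSolrClient.Builder.

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.params.CursorMarkParams;

public class CopyCollection {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:9983").build()) {
            String cursor = CursorMarkParams.CURSOR_MARK_START;
            SolrQuery q = new SolrQuery("*:*");
            q.setRows(1000);
            q.setSort(SolrQuery.SortClause.asc("id"));  // cursors need a sort on the unique key
            while (true) {
                q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
                QueryResponse rsp = client.query("oldCollection", q);
                List<SolrInputDocument> batch = new ArrayList<>();
                for (SolrDocument d : rsp.getResults()) {
                    SolrInputDocument in = new SolrInputDocument();
                    for (String f : d.getFieldNames()) {
                        if (!"_version_".equals(f)) {   // drop internal fields
                            in.addField(f, d.getFieldValue(f));
                        }
                    }
                    batch.add(in);
                }
                if (!batch.isEmpty()) {
                    client.add("newCollection", batch);
                }
                String next = rsp.getNextCursorMark();
                if (next.equals(cursor)) {
                    break;  // cursor stopped advancing: no more results
                }
                cursor = next;
            }
            client.commit("newCollection");
        }
    }
}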

On 29 October 2016 at 09:50, Erick Erickson  wrote:

> This is a little contradictory:
>
> > how do I reindex the data in place - without starting from the source?
>
> > then ran my reindex SolrJ code.
>
> So it looks like you _were_ able to re-index from scratch?
>
> BTW, to be absolutely safe I'd re-index to a _new_ collection and
> then, perhaps, use
> collection aliasing to switch seamlessly. I've seen situations where
> when some segments think fieldX has docValues and some don't it can be
> a problem.
>
> OTOH, if you define a _new_ field with docValues, that's no problem.
>
> And if it worked for you, it worked.
>
> Best,
> Erick
>
> On Fri, Oct 28, 2016 at 8:09 AM, tedsolr  wrote:
> > So I ran a quick test of my idea and it worked. I modified the schema.xml
> > file - uploaded it to ZK - reloaded the collection - then ran my reindex
> > SolrJ code. After it completed the schema browser in the admin console
> shows
> > that the field uses docValues. I tried a streaming expression on it using
> > the /export request handler and that was good - no errors.
> >
> > Still would love to hear from anyone who has done this differently.
> >
> >
> > tedsolr wrote
> >> Not all my fields use docValues. This is going to be a problem in the
> >> future. Once I change the schema.xml to use docValues for these certain
> >> field types, how do I reindex the data in place - without starting from
> >> the source?
> >>
> >> I'm aware of lucene's IndexUpgrader but that will only ensure a correct
> >> lucene match version. I'm not changing that. Could I use SolrJ to walk
> >> through the documents and "touch" each one and do an atomic update on
> the
> >> fields that have changed? (all the fields I care about are stored)
> >>
> >> Thanks, Ted
> >> v5.2.1
> >
> >
> >
> >
> >
> > --
> > View this message in context: http://lucene.472066.n3.
> nabble.com/Reindex-after-schema-change-options-tp4303395p4303510.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Solr 6.3.0 SQL question

2016-11-28 Thread Damien Kamerman
Aggregated selects only work with lower-case collection names (and no
dashes). (Bug in StatsStream I think)

I assume 'SOLR-9077 Streaming expressions should support collection alias'
which is fixed in 6.4 is a workaround.

On 29 November 2016 at 08:29, Kevin Risden  wrote:

> Is there a longer error/stack trace in your Solr server logs? I wonder if
> the real error is being masked.
>
> Kevin Risden
>
> On Mon, Nov 28, 2016 at 3:24 PM, Joe Obernberger <
> joseph.obernber...@gmail.com> wrote:
>
> > I'm running this query:
> >
> > curl --data-urlencode 'stmt=SELECT avg(TextSize) from UNCLASS'
> > http://cordelia:9100/solr/UNCLASS/sql?aggregationMode=map_reduce
> >
> > The error that I get back is:
> >
> > {"result-set":{"docs":[
> > {"EXCEPTION":"org.apache.solr.common.SolrException: Collection not
> found:
> > unclass","EOF":true,"RESPONSE_TIME":2}]}}
> >
> > TextSize is defined as:
> >  > indexed="true" stored="true"/>
> >
> > This query works fine:
> > curl --data-urlencode 'stmt=SELECT TextSize from UNCLASS'
> > http://cordelia:9100/solr/UNCLASS/sql?aggregationMode=map_reduce
> >
> > Any idea what I'm doing wrong?
> > Thank you!
> >
> > -Joe
> >
> >
>


JDBC: Collection not found with count(*) and uppercase name

2016-07-17 Thread Damien Kamerman
Hi,

I'm on Solr 6.1 and testing a JDBC query from SquirrelSQL and I find this
query works OK:
select id from c_D02016

But when I try this query I get an error: Collection not found c_d02016
select count(*) from c_D02016.

It seems solr is expecting the collection/table name to be lower-case. Has
anyone else seen this?
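
For reference, a hedged sketch of driving the same statements from plain JDBC
instead of SquirrelSQL, using the documented jdbc:solr URL form (the ZK connect
string is an assumption, and solr-solrj plus its dependencies must be on the
classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SolrJdbcCheck {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:solr://localhost:9983?collection=c_D02016";
        try (Connection con = DriverManager.getConnection(url);
             Statement stmt = con.createStatement();
             // the plain select works; swapping in "select count(*) from c_D02016"
             // reproduces the lower-cased "Collection not found" error described above
             ResultSet rs = stmt.executeQuery("select id from c_D02016")) {
            while (rs.next()) {
                System.out.println(rs.getString("id"));
            }
        }
    }
}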

Here's the full log from the server:
ERROR - 2016-07-18 13:46:23.711; [c:ip_0 s:shard1 r:core_node1
x:c_0_shard1_replica1] org.apache.solr.common.SolrException;
java.io.IOException: org.apache.solr.common.SolrException: Collection not
found: c_d02016
at
org.apache.solr.client.solrj.io.stream.StatsStream.open(StatsStream.java:221)
at
org.apache.solr.handler.SQLHandler$MetadataStream.open(SQLHandler.java:1578)
at
org.apache.solr.client.solrj.io.stream.ExceptionStream.open(ExceptionStream.java:51)
at
org.apache.solr.handler.StreamHandler$TimerStream.open(StreamHandler.java:423)
at
org.apache.solr.response.TextResponseWriter.writeTupleStream(TextResponseWriter.java:304)
at
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:168)
at
org.apache.solr.response.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:183)
at
org.apache.solr.response.JSONWriter.writeNamedList(JSONResponseWriter.java:299)
at
org.apache.solr.response.JSONWriter.writeResponse(JSONResponseWriter.java:95)
at
org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:60)
at
org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)
at
org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:731)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:473)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:318)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Collection not found:
ip_tiger_d02016-00
at
org.apache.solr.client.solrj.impl.CloudSolrClient.getCollectionNames(CloudSolrClient.java:1248)
at
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:961)
at
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:934)
at
org.apache.solr.client.solrj.io.stream.StatsStream.open(StatsStream.java:218)
... 40 more

Regards,
Damien.


Re: SOLR-7191 SolrCloud 5 with thousands of collections

2015-10-19 Thread Damien Kamerman
OK, turned out ZkStateReader.constructState() was only calling
ClusterState.getCollections()
for log.debug(). I removed that and the next bottleneck is talking
to ZkStateReader.fetchCollectionState.

"coreZkRegister-4-thread-14-processing-n:ftet1:8003_solr
x:t_1558_shard1_replica1 s:shard1 c:t_1558 r:core_node1" #151 prio=5
os_prio=64 tid=0x05568800 nid=0xc8 in Object.wait()
[0x7fefb117c000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
- locked <0x7fff50fadf70> (a
org.apache.zookeeper.ClientCnxn$Packet)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
at
org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:353)
at
org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:350)
at
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
at
org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:350)
at
org.apache.solr.common.cloud.ZkStateReader.fetchCollectionState(ZkStateReader.java:1029)
at
org.apache.solr.common.cloud.ZkStateReader.updateClusterState(ZkStateReader.java:260)
- locked <0x7ff040b92270> (a
org.apache.solr.common.cloud.ZkStateReader)
at
org.apache.solr.cloud.ZkController.register(ZkController.java:979)
at
org.apache.solr.cloud.ZkController.register(ZkController.java:881)
at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:184)
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


On 19 October 2015 at 15:59, Damien Kamerman <dami...@gmail.com> wrote:

> Hi All,
>
> I've had a first look at porting the patch I did for SOLR-7191 (SolrCloud
> with thousands of collections) in Solr 4.10 to the Solr trunk (1708905).
> Now I created 6,000 collections (3 nodes; 2 x replicas) and re-started the
> 3 nodes. What I noticed is that the cloud is starting but slowly. All the 
> org.apache.solr.core.CoreContainer.create()
> threads are blocked in the ZkStateReader. I was hoping the changes to
> clusterstate.json from global to per collection would reduce the
> contention. Comments appreciated.
>
> example jstacks:
> "coreLoadExecutor-6-thread-24-processing-n:ftet1:8003_solr" #70 prio=5
> os_prio=64 tid=0x00bcd800 nid=0x88 waiting for monitor entry
> [0x7fefb29bc000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at
> org.apache.solr.common.cloud.ZkStateReader.addCollectionWatch(ZkStateReader.java:1048)
> - waiting to lock <0x7ff0403ff020> (a
> org.apache.solr.common.cloud.ZkStateReader)
> at
> org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1561)
> at
> org.apache.solr.core.CoreContainer.create(CoreContainer.java:726)
> at
> org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:451)
> at
> org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:442)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
> "zkCallback-4-thread-80-processing-n:ftet1:8003_solr" #268 prio=5
> os_prio=64 tid=0x02ee nid=0x134 in Object.wait()
> [0x7fefaed2d000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
> - locked <0x7ff0be17e600> (a
> org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
> at
> org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:353)
> at
> org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:350)
> at
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
> at
> org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:350)
> at
> org.apache.so

SOLR-7191 SolrCloud 5 with thousands of collections

2015-10-18 Thread Damien Kamerman
Hi All,

I've had a first look at porting the patch I did for SOLR-7191 (SolrCloud
with thousands of collections) in Solr 4.10 to the Solr trunk (1708905).
Now I created 6,000 collections (3 nodes; 2 x replicas) and re-started the
3 nodes. What I noticed is that the cloud is starting but slowly. All
the org.apache.solr.core.CoreContainer.create()
threads are blocked in the ZkStateReader. I was hoping the changes to
clusterstate.json from global to per collection would reduce the
contention. Comments appreciated.

example jstacks:
"coreLoadExecutor-6-thread-24-processing-n:ftet1:8003_solr" #70 prio=5
os_prio=64 tid=0x00bcd800 nid=0x88 waiting for monitor entry
[0x7fefb29bc000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at
org.apache.solr.common.cloud.ZkStateReader.addCollectionWatch(ZkStateReader.java:1048)
- waiting to lock <0x7ff0403ff020> (a
org.apache.solr.common.cloud.ZkStateReader)
at
org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1561)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:726)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:451)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:442)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

"zkCallback-4-thread-80-processing-n:ftet1:8003_solr" #268 prio=5
os_prio=64 tid=0x02ee nid=0x134 in Object.wait()
[0x7fefaed2d000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
- locked <0x7ff0be17e600> (a
org.apache.zookeeper.ClientCnxn$Packet)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
at
org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:353)
at
org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:350)
at
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
at
org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:350)
at
org.apache.solr.common.cloud.ZkStateReader.fetchCollectionState(ZkStateReader.java:1030)
at
org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader.java:1015)
at
org.apache.solr.common.cloud.ZkStateReader$LazyCollectionRef.get(ZkStateReader.java:550)
at
org.apache.solr.common.cloud.ClusterState.getCollections(ClusterState.java:207)
at
org.apache.solr.common.cloud.ZkStateReader.constructState(ZkStateReader.java:462)
at
org.apache.solr.common.cloud.ZkStateReader.access$600(ZkStateReader.java:57)
at
org.apache.solr.common.cloud.ZkStateReader$StateWatcher.process(ZkStateReader.java:864)
- locked <0x7ff0403ff020> (a
org.apache.solr.common.cloud.ZkStateReader)
at
org.apache.solr.common.cloud.SolrZkClient$3$1.run(SolrZkClient.java:269)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


Re: rough maximum cores (shards) per machine?

2015-03-25 Thread Damien Kamerman
I've tried (very simplistically) hitting a collection with a good variety
of searches and looking at the collection's heap memory and working out the
bytes / doc. I've seen results around 100 bytes / doc, and as low as 3
bytes / doc for collections with small docs. It's still a work-in-progress
- not sure if it will scale with docs - or is too simplistic.

On 25 March 2015 at 17:49, Shai Erera ser...@gmail.com wrote:

 While it's hard to answer this question because as others have said, it
 depends, I think it will be good of we can quantify or assess the cost of
 running a SolrCore.

 For instance, let's say that a server can handle a load of 10M indexed
 documents (I omit search load on purpose for now) in a single SolrCore.
 Would the same server be able to handle the same number of documents, if we
 indexed 1000 docs per SolrCore, in a total of 10,000 SolrCores? If the answer
 is no, then it means there is some cost that comes w/ each SolrCore, and we
 may at least be able to give an upper bound --- on a server with X amount
 of storage, Y GB RAM and Z cores you can run up to maxSolrCores(X, Y, Z).

 Another way to look at it, if I were to create empty SolrCores, would I be
 able to create an infinite number of cores if storage was infinite? Or even
 empty cores have their toll on CPU and RAM?

 I know from the Lucene side of things that each SolrCore (carries a Lucene
 index) there is a toll to an index -- the lexicon, IW's RAM buffer, Codecs
 that store things in memory etc. For instance, one downside of splitting a
 10M core into 10,000 cores is that the cost of the holding the total
 lexicon (dictionary of indexed words) goes up drastically, since now every
 word (just the byte[] of the word) is potentially represented in memory
 10,000 times.

 What other RAM/CPU/Storage costs does a SolrCore carry with it? There are
 the caches of course, which really depend on how many documents are
 indexed. Any other non-trivial or constant cost?

 So yes, there isn't a single answer to this question. It's just like
 someone would ask how many documents can a single Lucene index handle
 efficiently. But if we can come up with basic numbers as I outlined above,
 it might help people doing rough estimates. That doesn't mean people
 shouldn't benchmark, as that upper bound may be wy too high for their
 data set, query workload and search needs.

 Shai

 On Wed, Mar 25, 2015 at 5:25 AM, Damien Kamerman dami...@gmail.com
 wrote:

   From my experience on a high-end server (256GB memory, 40 core CPU)
 testing
  collection numbers with one shard and two replicas, the maximum that
 would
  work is 3,000 cores (1,500 collections). I'd recommend much less (perhaps
  half of that), depending on your startup-time requirements. (Though I
 have
  settled on 6,000 collection maximum with some patching. See SOLR-7191).
 You
  could create multiple clouds after that, and choose the cloud least used
 to
  create your collection.
 
  Regarding memory usage I'd pencil in 6MB overhead (no docs) java heap per
  collection.
 
  On 25 March 2015 at 13:46, Ian Rose ianr...@fullstory.com wrote:
 
   First off thanks everyone for the very useful replies thus far.
  
   Shawn - thanks for the list of items to check.  #1 and #2 should be
 fine
   for us and I'll check our ulimit for #3.
  
   To add a bit of clarification, we are indeed using SolrCloud.  Our
  current
   setup is to create a new collection for each customer.  For now we
 allow
   SolrCloud to decide for itself where to locate the initial shard(s) but
  in
   time we expect to refine this such that our system will automatically
   choose the least loaded nodes according to some metric(s).
  
   Having more than one business entity controlling the configuration of a
single (Solr) server is a recipe for disaster. Solr works well if
 there
   is
an architect for the system.
  
  
   Jack, can you explain a bit what you mean here?  It looks like Toke
  caught
   your meaning but I'm afraid it missed me.  What do you mean by
 business
   entity?  Is your concern that with automatic creation of collections
  they
   will be distributed willy-nilly across the cluster, leading to uneven
  load
   across nodes?  If it is relevant, the schema and solrconfig are
  controlled
   entirely by me and is the same for all collections.  Thus theoretically
  we
   could actually just use one single collection for all of our customers
   (adding a 'customer:whatever' type fq to all queries) but since we
  never
   need to query across customers it seemed more performant (as well as
  safer
   - less chance of accidentally leaking data across customers) to use
   separate collections.
  
   Better to give each tenant a separate Solr instance that you spin up
 and
spin down based on demand.
  
  
   Regarding this, if by tenant you mean customer, this is not viable
 for
  us
   from a cost perspective.  As I mentioned initially, many of our
 customers
   are very small so

Re: rough maximum cores (shards) per machine?

2015-03-24 Thread Damien Kamerman
From my experience on a high-end server (256GB memory, 40 core CPU) testing
collection numbers with one shard and two replicas, the maximum that would
work is 3,000 cores (1,500 collections). I'd recommend much less (perhaps
half of that), depending on your startup-time requirements. (Though I have
settled on 6,000 collection maximum with some patching. See SOLR-7191). You
could create multiple clouds after that, and choose the cloud least used to
create your collection.

Regarding memory usage I'd pencil in 6MB overhead (no docs) java heap per
collection.

On 25 March 2015 at 13:46, Ian Rose ianr...@fullstory.com wrote:

 First off thanks everyone for the very useful replies thus far.

 Shawn - thanks for the list of items to check.  #1 and #2 should be fine
 for us and I'll check our ulimit for #3.

 To add a bit of clarification, we are indeed using SolrCloud.  Our current
 setup is to create a new collection for each customer.  For now we allow
 SolrCloud to decide for itself where to locate the initial shard(s) but in
 time we expect to refine this such that our system will automatically
 choose the least loaded nodes according to some metric(s).

 Having more than one business entity controlling the configuration of a
  single (Solr) server is a recipe for disaster. Solr works well if there
 is
  an architect for the system.


 Jack, can you explain a bit what you mean here?  It looks like Toke caught
 your meaning but I'm afraid it missed me.  What do you mean by business
 entity?  Is your concern that with automatic creation of collections they
 will be distributed willy-nilly across the cluster, leading to uneven load
 across nodes?  If it is relevant, the schema and solrconfig are controlled
 entirely by me and is the same for all collections.  Thus theoretically we
 could actually just use one single collection for all of our customers
 (adding a 'customer:whatever' type fq to all queries) but since we never
 need to query across customers it seemed more performant (as well as safer
 - less chance of accidentally leaking data across customers) to use
 separate collections.

 Better to give each tenant a separate Solr instance that you spin up and
  spin down based on demand.


 Regarding this, if by tenant you mean customer, this is not viable for us
 from a cost perspective.  As I mentioned initially, many of our customers
 are very small so dedicating an entire machine to each of them would not be
 economical (or efficient).  Or perhaps I am not understanding what your
 definition of tenant is?

 Cheers,
 Ian



 On Tue, Mar 24, 2015 at 4:51 PM, Toke Eskildsen t...@statsbiblioteket.dk
 wrote:

  Jack Krupansky [jack.krupan...@gmail.com] wrote:
   I'm sure that I am quite unqualified to describe his hypothetical
 setup.
  I
   mean, he's the one using the term multi-tenancy, so it's for him to be
   clear.
 
  It was my understanding that Ian used them interchangeably, but of course
  Ian it the only one that knows.
 
   For me, it's a question of who has control over the config and schema
 and
   collection creation. Having more than one business entity controlling
 the
   configuration of a single (Solr) server is a recipe for disaster.
 
  Thank you. Now your post makes a lot more sense. I will not argue against
  that.
 
  - Toke Eskildsen
 




-- 
Damien Kamerman


Re: Unable to index rich-text documents in Solr Cloud

2015-03-18 Thread Damien Kamerman
I suggest you check your solr logs for more info as to the cause.

On 19 March 2015 at 12:58, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote:

 Hi Erick,

 No, the PDF file is a testing file which only contains 1 sentence.

 I've managed to get it to work by removing startup="lazy" in
 the ExtractingRequestHandler and adding the following lines:
   <str name="uprefix">ignored_</str>
   <str name="captureAttr">true</str>
   <str name="fmap.a">links</str>
   <str name="fmap.div">ignored_</str>

 Does the presence of startup="lazy" affect the function of
 ExtractingRequestHandler, or is it one of the str name values?

 Regards,
 Edwin


 On 18 March 2015 at 23:19, Erick Erickson erickerick...@gmail.com wrote:

  Shot in the dark, but is the PDF file significantly larger than the
   others? Perhaps you're simply exceeding the packet limits for the
  servlet container?
 
  Best,
  Erick
 
  On Wed, Mar 18, 2015 at 12:22 AM, Zheng Lin Edwin Yeo
  edwinye...@gmail.com wrote:
   Hi everyone,
  
   I'm having some issues with indexing rich-text documents from the Solr
   Cloud. When I tried to index a pdf or word document, I get the
 following
   error:
  
  
   org.apache.solr.common.SolrException: Bad Request
  
  
  
   request:
 
 http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADERdistrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2Fwt=javabinversion=2
   at
 
 org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:241)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown
  Source)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
  Source)
   at java.lang.Thread.run(Unknown Source)
  
  
   I'm able to index .xml and .csv files in Solr Cloud with the same
  configuration.
  
   I have setup Solr Cloud using the default zookeeper in Solr 5.0.0, and
   I have 2 shards with the following details:
   Shard1: 192.168.2.2:8983
   Shard2: 192.168.2.2:8984
  
   Prior to this, I'm already able to index rich-text documents without
   the Solr Cloud, and I'm using the same solrconfig.xml and schema.xml,
   so my ExtractRequestHandler is already defined.
  
   Is there other settings required in order to index rich-text documents
   in Solr Cloud?
  
  
   Regards,
   Edwin
 




-- 
Damien Kamerman


Re: Unable to index rich-text documents in Solr Cloud

2015-03-18 Thread Damien Kamerman
It sounds like https://issues.apache.org/jira/browse/SOLR-5551
Have you checked the solr.log for all nodes?

On 19 March 2015 at 14:43, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote:

 These are the logs that I got from solr.log. I can't seem to figure out
 what's wrong with it. Does anyone know?



 ERROR - 2015-03-18 15:06:51.019;
 org.apache.solr.update.StreamingSolrClients$1; error
 org.apache.solr.common.SolrException: Bad Request



 request:

 http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADERdistrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2Fwt=javabinversion=2
 
 http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADERdistrib.from=http%3A%2F%2F192.168.23.72%3A8983%2Fsolr%2Flogmill%2Fwt=javabinversion=2
 
 at

 org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:241)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
 INFO  - 2015-03-18 15:06:51.019;
 org.apache.solr.update.processor.LogUpdateProcessor; [logmill] webapp=/solr
 path=/update/extract params={literal.id
 =C:\Users\edwin\solr-5.0.0\example\exampledocs\solr-word.pdfresource.name
 =C:\Users\edwin\solr-5.0.0\example\exampledocs\solr-word.pdf}
 {add=[C:\Users\edwin\solr-5.0.0\example\exampledocs\solr-word.pdf]} 0 1252
 INFO  - 2015-03-18 15:06:51.029;
 org.apache.solr.update.DirectUpdateHandler2; start

 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
 INFO  - 2015-03-18 15:06:51.029;
 org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
 Skipping IW.commit.
 INFO  - 2015-03-18 15:06:51.029; org.apache.solr.core.SolrCore;
 SolrIndexSearcher has not changed - not re-opening:
 org.apache.solr.search.SolrIndexSearcher
 INFO  - 2015-03-18 15:06:51.039;
 org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
 INFO  - 2015-03-18 15:06:51.039;
 org.apache.solr.update.processor.LogUpdateProcessor; [logmill] webapp=/solr
 path=/update params={waitSearcher=truedistrib.from=

 http://192.168.2.2:8983/solr/logmill/update.distrib=FROMLEADERopenSearcher=truecommit=truewt=javabinexpungeDeletes=falsecommit_end_point=trueversion=2softCommit=false
 }
 {commit=} 0 10
 INFO  - 2015-03-18 15:06:51.039;
 org.apache.solr.update.processor.LogUpdateProcessor; [logmill] webapp=/solr
 path=/update params={commit=true} {commit=} 0 10



 Regards,
 Edwin


 On 19 March 2015 at 10:56, Damien Kamerman dami...@gmail.com wrote:

  I suggest you check your solr logs for more info as to the cause.
 
  On 19 March 2015 at 12:58, Zheng Lin Edwin Yeo edwinye...@gmail.com
  wrote:
 
   Hi Erick,
  
   No, the PDF file is a testing file which only contains 1 sentence.
  
   I've managed to get it to work by removing startup=lazy in
   the ExtractingRequestHandler and added the following lines:
 <str name="uprefix">ignored_</str>
 <str name="captureAttr">true</str>
 <str name="fmap.a">links</str>
 <str name="fmap.div">ignored_</str>
  
   Does the presence of startup=lazy affect the function of
   ExtractingRequestHandler , or is it one of the str name values?
  
   Regards,
   Edwin
  
  
   On 18 March 2015 at 23:19, Erick Erickson erickerick...@gmail.com
  wrote:
  
Shot in the dark, but is the PDF file significantly larger than the
others? Perhaps your simply exceeding the packet limits for the
servlet container?
   
Best,
Erick
   
On Wed, Mar 18, 2015 at 12:22 AM, Zheng Lin Edwin Yeo
edwinye...@gmail.com wrote:
 Hi everyone,

 I'm having some issues with indexing rich-text documents from the
  Solr
 Cloud. When I tried to index a pdf or word document, I get the
   following
 error:


 org.apache.solr.common.SolrException: Bad Request



 request:
   
  
 
 http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADERdistrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2Fwt=javabinversion=2
 at
   
  
 
 org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:241)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown
Source)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
Source)
 at java.lang.Thread.run(Unknown Source)


 I'm able to index .xml and .csv files in Solr Cloud with the same
configuration.

 I have setup Solr Cloud using the default zookeeper in Solr 5.0.0,
  and
 I have 2 shards with the following details:
 Shard1: 192.168.2.2:8983
 Shard2: 192.168.2.2:8984

 Prior to this, I'm already able to index rich-text documents
 without
 the Solr Cloud, and I'm using the same solrconfig.xml and
 schema.xml,
 so my ExtractRequestHandler is already defined.

 Is there other

Re: backport Heliosearch features to Solr

2015-03-15 Thread Damien Kamerman
Sounds like 64bit is OK. Would be worth re-testing that G1GC assert trip
with the latest JDK.

On 14 March 2015 at 01:46, Shawn Heisey apa...@elyograg.org wrote:

 On 3/12/2015 5:11 PM, Markus Jelsma wrote:
  Hello - i would assume off-heap would out perform any heap based data
 structure. G1 is only useful if you deal with very large heaps, and it eats
 CPU at the same time. As much as G1 is better than CMS in some cases, you
 would still have less wasted CPU time and resp. less STW events.
 
  Anyway. if someone has a setup at hand to provide details, please do :)

 I do not have any info about CPU usage with G1 vs. CMS.  My Solr servers
 are extremely lightly loaded so CPU usage is never a problem.  For very
 busy servers, that could be an issue, and CMS is probably the way to go.

 I have been able to reduce GC times (both the average time and the long
 collection time) by a large margin using G1 vs. CMS, with the help of
 the hotspot-gc-use mailing list maintained by the OpenJDK project.

 I keep my mad ramblings about GC tuning on the Solr wiki:

 http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr

 The only caveat that I have is that the Lucene project has said never to
 use G1GC with Lucene.  Their exact words, found on the wiki URL below,
 are: Do not, under any circumstances, run Lucene with the G1 garbage
 collector. 

 I can say that I have never had a problem with G1, but if you choose to
 use it, you should know that you are going against the advice of the
 people who write the code:


 http://wiki.apache.org/lucene-java/JavaBugs#Oracle_Java_.2F_Sun_Java_.2F_OpenJDK_Bugs

 Thanks,
 Shawn




-- 
Damien Kamerman


Re: backport Heliosearch features to Solr

2015-03-12 Thread Damien Kamerman
Are there any results of off-heap cache vs JRE 8 with G1GC?

On 10 March 2015 at 11:13, Alexandre Rafalovitch arafa...@gmail.com wrote:

 Ask and you shall receive:
 SOLR-7210 Off-Heap filter cache
 SOLR-7211 Off-Heap field cache
 SOLR-7212 Parameter substitution
 SOLR-7214 JSON Facet API
 SOLR-7216 JSON Request API

 Regards,
Alex.
 P.s. Oh, the power of GMail filters :-)
 
 Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
 http://www.solr-start.com/


 On 9 March 2015 at 18:59, Markus Jelsma markus.jel...@openindex.io
 wrote:
  Ok, so what's next? Do you intend to open issues and send the links over
 here so interested persons can follow them? Clearly some would like to see
 features to merge. Let's see what the PMC thinks about it :)
 
  Cheers,
  M.
 
  -Original message-
  From:Yonik Seeley ysee...@gmail.com
  Sent: Monday 9th March 2015 19:53
  To: solr-user@lucene.apache.org
  Subject: Re: backport Heliosearch features to Solr
 
  Thanks everyone for voting!
 
  Result charts (note that these auto-generated charts don't show blanks
  as equivalent to 0)
 
 https://docs.google.com/forms/d/1gaMpNpHVdquA3q75yiFhqZhAWdWB-K6N8Jh3dBbWAU8/viewanalytics
 
  Raw results spreadsheet (correlations can be interesting), and
  percentages at the bottom.
 
 https://docs.google.com/spreadsheets/d/1uZ2qgOaKx1ZxJ_NKwj2zIAYFQ9fp8OrEPI5hqadcPeY/
 
  -Yonik
 
 
  On Sun, Mar 1, 2015 at 4:50 PM, Yonik Seeley ysee...@gmail.com wrote:
   As many of you know, I've been doing some work in the experimental
   heliosearch fork of Solr over the past year.  I think it's time to
   bring some more of those changes back.
  
   So here's a poll: Which Heliosearch features do you think should be
   brought back to Apache Solr?
  
   http://bit.ly/1E7wi1Q
   (link to google form)
  
   -Yonik
 




-- 
Damien Kamerman


Re: solr cloud does not start with many collections

2015-03-11 Thread Damien Kamerman
Didier, I'm starting to look at SOLR-6399
 after the core was unloaded, it was absent from the collection list, as
if it never existed. On the other hand, re-issuing a CREATE call with the
same collection restored the collection, along with its data
The collection is still in ZK though?

 upon restart Solr tried to reload the previously-unloaded collection.
Looks like CoreContainer.load() uses CoreDescriptor.isTransient() and
CoreDescriptor.isLoadOnStartup() properties on startup.


On 7 March 2015 at 13:10, didier deshommes dfdes...@gmail.com wrote:

 It would be a huge step forward if one could have several hundreds of Solr
 collections, but only have a small portion of them opened/loaded at the
 same time. This is similar to ElasticSearch's close index api, listed here:

 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-open-close.html
 . I've opened an issue to implement the same in Solr here a few months ago:
 https://issues.apache.org/jira/browse/SOLR-6399

 On Thu, Mar 5, 2015 at 4:42 PM, Damien Kamerman dami...@gmail.com wrote:

  I've tried a few variations, with 3 x ZK, 6 X nodes, solr 4.10.3, solr
 5.0
  without any success and no real difference. There is a tipping point at
  around 3,000-4,000 cores (varies depending on hardware) from where I can
  restart the cloud OK within ~4min, to the cloud not working and
  continuous 'conflicting
  information about the leader of shard' warnings.
 
  On 5 March 2015 at 14:15, Shawn Heisey apa...@elyograg.org wrote:
 
   On 3/4/2015 5:37 PM, Damien Kamerman wrote:
I'm running on Solaris x86, I have plenty of memory and no real
 limits
# plimit 15560
15560:  /opt1/jdk/bin/java -d64 -server -Xss512k -Xms32G -Xmx32G
-XX:MaxMetasp
   resource  current maximum
  time(seconds) unlimited   unlimited
  file(blocks)  unlimited   unlimited
  data(kbytes)  unlimited   unlimited
  stack(kbytes) unlimited   unlimited
  coredump(blocks)  unlimited   unlimited
  nofiles(descriptors)  65536   65536
  vmemory(kbytes)   unlimited   unlimited
   
I've been testing with 3 nodes, and that seems OK up to around 3,000
   cores
total. I'm thinking of testing with more nodes.
  
   I have opened an issue for the problems I encountered while recreating
 a
   config similar to yours, which I have been doing on Linux.
  
   https://issues.apache.org/jira/browse/SOLR-7191
  
   It's possible that the only thing the issue will lead to is
 improvements
   in the documentation, but I'm hopeful that there will be code
   improvements too.
  
   Thanks,
   Shawn
  
  
 
 
  --
  Damien Kamerman
 




-- 
Damien Kamerman


Re: solr cloud does not start with many collections

2015-03-05 Thread Damien Kamerman
I've tried a few variations, with 3 x ZK, 6 X nodes, solr 4.10.3, solr 5.0
without any success and no real difference. There is a tipping point at
around 3,000-4,000 cores (varies depending on hardware) from where I can
restart the cloud OK within ~4min, to the cloud not working and
continuous 'conflicting
information about the leader of shard' warnings.

On 5 March 2015 at 14:15, Shawn Heisey apa...@elyograg.org wrote:

 On 3/4/2015 5:37 PM, Damien Kamerman wrote:
  I'm running on Solaris x86, I have plenty of memory and no real limits
  # plimit 15560
  15560:  /opt1/jdk/bin/java -d64 -server -Xss512k -Xms32G -Xmx32G
  -XX:MaxMetasp
 resource  current maximum
time(seconds) unlimited   unlimited
file(blocks)  unlimited   unlimited
data(kbytes)  unlimited   unlimited
stack(kbytes) unlimited   unlimited
coredump(blocks)  unlimited   unlimited
nofiles(descriptors)  65536   65536
vmemory(kbytes)   unlimited   unlimited
 
  I've been testing with 3 nodes, and that seems OK up to around 3,000
 cores
  total. I'm thinking of testing with more nodes.

 I have opened an issue for the problems I encountered while recreating a
 config similar to yours, which I have been doing on Linux.

 https://issues.apache.org/jira/browse/SOLR-7191

 It's possible that the only thing the issue will lead to is improvements
 in the documentation, but I'm hopeful that there will be code
 improvements too.

 Thanks,
 Shawn




-- 
Damien Kamerman


Re: solr cloud does not start with many collections

2015-03-04 Thread Damien Kamerman
I'm running on Solaris x86, I have plenty of memory and no real limits
# plimit 15560
15560:  /opt1/jdk/bin/java -d64 -server -Xss512k -Xms32G -Xmx32G
-XX:MaxMetasp
   resource  current maximum
  time(seconds) unlimited   unlimited
  file(blocks)  unlimited   unlimited
  data(kbytes)  unlimited   unlimited
  stack(kbytes) unlimited   unlimited
  coredump(blocks)  unlimited   unlimited
  nofiles(descriptors)  65536   65536
  vmemory(kbytes)   unlimited   unlimited

I've been testing with 3 nodes, and that seems OK up to around 3,000 cores
total. I'm thinking of testing with more nodes.


On 5 March 2015 at 05:28, Shawn Heisey apa...@elyograg.org wrote:

 On 3/4/2015 2:09 AM, Shawn Heisey wrote:
  I've come to one major conclusion about this whole thing, even before
  I reach the magic number of 4000 collections. Thousands of collections
  is not at all practical with SolrCloud currently.

 I've now encountered a new problem.  I may have been hasty in declaring
 that an increase of jute.maxbuffer is not required.  There are now 3715
 collections, and I've seen a zookeeper exception that may indicate an
 increase actually is required.  I have added that parameter to the
 startup and when I have some time to look deeper, I will see whether
 that helps.
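 (As a rough sketch, in case it helps anyone following along: the property
 has to be set on both the ZooKeeper server and the Solr JVMs, and the value
 and paths below are only examples, not a recommendation:

   # ZooKeeper 3.4.x: conf/zookeeper-env.sh is sourced by zkEnv.sh
   SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Djute.maxbuffer=67108864"

   # Solr 5.0, using bin/solr's -a option to pass extra JVM args
   bin/solr start -c -z localhost:2181 -a "-Djute.maxbuffer=67108864"
 )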

 Before 5.0, the maxbuffer would have been exceeded by only a few hundred
 collections ... so this is definitely progress.

 Thanks,
 Shawn




-- 
Damien Kamerman


Re: solr cloud does not start with many collections

2015-03-03 Thread Damien Kamerman
After one minute from startup I sometimes see the
'org.apache.solr.cloud.ZkController; Timed out waiting to see all nodes
published as DOWN in our cluster state.'
And I see the 'Still seeing conflicting information about the leader of
shard' after about 5 minutes.
Thanks Shawn, I will create an issue.

On 4 March 2015 at 01:10, Shawn Heisey apa...@elyograg.org wrote:

 On 3/3/2015 6:55 AM, Shawn Heisey wrote:
  With a longer zkClientTimeout, does the failure happen on a later
  collection?  I had hoped that it would solve the problem, but I'm
  curious about whether it was able to load more collections before it
  finally died, or whether it made no difference... and whether the
  message now indicates 40 seconds or if it still says 30.

 I have found the code that produces the message, and the wait for this
 particular section is hardcoded to 30 seconds.  That means the timeout
 won't affect it.

 If you move the Solr log so it creates a new one from startup, how long
 does it take after startup begins before you see the failure that
 indicates the conflicting leader information hasn't resolved?
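 (Something as simple as this would capture that timing; the log location is
 just an assumption, use wherever your solr.log actually lives:

   mv logs/solr.log logs/solr.log.old      # rotate before restarting
   # after restarting the node:
   head -1 logs/solr.log                   # first timestamp after startup
   grep 'conflicting information about the leader' logs/solr.log | head -1
 )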

 This most likely is a bug ... our SolrCloud experts will need to
 investigate to find it, so we need as much information as you can provide.

 Thanks,
 Shawn




-- 
Damien Kamerman


Re: solr cloud does not start with many collections

2015-03-03 Thread Damien Kamerman
I've done a similar thing to create the collections. You're going to need
more memory I think.

OK, so maxThreads limit on jetty could be causing a distributed dead-lock?


On 4 March 2015 at 13:18, Shawn Heisey apa...@elyograg.org wrote:

 On 3/2/2015 12:54 AM, Damien Kamerman wrote:
  I still see the same cloud startup issue with Solr 5.0.0. I created 4,000
  collections from scratch and then attempted to stop/start the cloud.

 I have been trying to duplicate your setup using the -e cloud example
 included in the Solr 5.0 download and accepting all the defaults.  This
 sets up two Solr instances on one machine, one of which runs an embedded
 zookeeper.

 I have been running into a LOT of issues just trying to get so many
 collections created, to say nothing about restart problems.

 The first problem I ran into was heap size.  The example starts each of
 the Solr instances with a 512MB heap, which is WAY too small.  It
 allowed me to create 274 collections, in addition to the gettingstarted
 collection that the example started with.  One of the Solr instances
 simply crashed.  No OutOfMemoryException or anything else in the log ...
 it just died.

 I bumped the heap on each Solr instance to 4GB.  The next problem I ran
 into was the operating system limit on the number of processes ... and I
 had already bumped that up beyond the usual 1024 default, to 4096.  Solr
 was not able to create any more threads, because my user was not able to
 fork any more processes.  I got over 700 collections created before that
 became a problem.  My max open files had also been increased already --
 this is another place where a stock system will run into trouble
 creating a lot of collections.

 I fixed that, and the next problem I ran into was total RAM on the
 machine ... it turns out that with two Solr processes each using 4GB, I
 was dipping 3GB deep into swap.  This is odd, because I have 12GB of RAM
 on that machine and it's not doing very much besides this SolrCloud
 test.  Swapping means that performance was completely unacceptable and
 it would probably never finish.

 So ... I had to find a machine with more memory.  I've got a dev server
 with 32GB.  I fired up the two SolrCloud processes on it with 5GB heap
 each, with 32768 processes allowed.  I am in the process of building
 4000 collections (numShards=2, replicationFactor=1), and so far, it is
 working OK.  I have almost 2700 collections now.
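 (For anyone repeating this, the preparation on the dev box amounted to
 something like the following before starting the instances; the limits,
 heap size, ports and ZK address are examples, not recommendations:

   ulimit -u 32768      # max user processes -- threads count against this
   ulimit -n 65536      # max open file descriptors
   bin/solr start -c -m 5g -p 8983 -z localhost:9983
 )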

 If I can ever get it to actually build 4000 collections, then I can
 attempt restarting the second Solr instance and see what happens.  I
 think I might hit another roadblock in the form of the
 10,000 maxThreads limit on Jetty.  Running this all on one machine might
 not be possible, but I'm giving it a try.

 Here's the script I am using to create all those collections:

 #!/bin/sh

 for i in `seq -f %04.0f 0 3999`
 do
   echo $i
   coll=mycoll${i}
   URL="http://localhost:8983/solr/admin/collections"
   URL="${URL}?action=CREATE&name=${coll}&numShards=2&replicationFactor=1"
   URL="${URL}&collection.configName=gettingstarted"
   curl "$URL"
 done

 Thanks,
 Shawn




-- 
Damien Kamerman


Re: solr cloud does not start with many collections

2015-03-02 Thread Damien Kamerman
Still no luck starting solr with 40s zkClientTimeout. I'm not seeing any
expired sessions...
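(By 'not seeing any expired sessions' I mean a grep over both sides turns up
nothing; the log file names are whatever your setup writes to:

  grep -i "expired" logs/solr.log
  grep -i "expiring session" zookeeper.out
)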

There must be a way to start solr with many collections. It runs fine...
until a restart is required.

On 3 March 2015 at 03:33, Shawn Heisey apa...@elyograg.org wrote:

 On 3/2/2015 12:54 AM, Damien Kamerman wrote:
  I still see the same cloud startup issue with Solr 5.0.0. I created 4,000
  collections from scratch and then attempted to stop/start the cloud.
 
  node1:
  WARN  - 2015-03-02 18:09:02.371;
  org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
  WARN  - 2015-03-02 18:10:07.196; org.apache.solr.cloud.ZkController;
 Timed
  out waiting to see all nodes published as DOWN in our cluster state.
  WARN  - 2015-03-02 18:13:46.238; org.apache.solr.cloud.ZkController;
 Still
  seeing conflicting information about the leader of shard shard1 for
  collection DD-3219 after 30 seconds; our state says
  http://host:8002/solr/DD-3219_shard1_replica1/, but ZooKeeper says
  http://host:8000/solr/DD-3219_shard1_replica2/
 
  node2:
  WARN  - 2015-03-02 18:09:01.871;
  org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
  WARN  - 2015-03-02 18:17:04.458;
  org.apache.solr.common.cloud.ZkStateReader$3; ZooKeeper watch triggered,
  but Solr cannot talk to ZK
  stop/start
  WARN  - 2015-03-02 18:53:12.725;
  org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
  WARN  - 2015-03-02 18:56:30.702; org.apache.solr.cloud.ZkController;
 Still
  seeing conflicting information about the leader of shard shard1 for
  collection DD-3581 after 30 seconds; our state says
  http://host:8001/solr/DD-3581_shard1_replica2/, but ZooKeeper says
  http://host:8002/solr/DD-3581_shard1_replica1/
 
  node3:
  WARN  - 2015-03-02 18:09:03.022;
  org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
  WARN  - 2015-03-02 18:10:08.178; org.apache.solr.cloud.ZkController;
 Timed
  out waiting to see all nodes published as DOWN in our cluster state.
  WARN  - 2015-03-02 18:13:47.737; org.apache.solr.cloud.ZkController;
 Still
  seeing conflicting information about the leader of shard shard1 for
  collection DD-2707 after 30 seconds; our state says
  http://host:8002/solr/DD-2707_shard1_replica2/, but ZooKeeper says
  http://host:8000/solr/DD-2707_shard1_replica1/

 I'm sorry to hear that 5.0 didn't fix the problem.  I really hoped that
 it would.

 There is one other thing I'd like to try before you file a bug --
 increasing zkClientTimeout to 40 seconds, to see whether it changes the
 point at which it fails (or allows it to succeed).  With the
 default tickTime (2 seconds), the maximum time you can set
 zkClientTimeout to is 40 seconds ... which in normal circumstances is a
 VERY long time.  In your situation, at least with the code in its
 current state, 30 seconds (I'm pretty sure this is the default in 5.0)
 may simply not be enough.
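 (Assuming the stock solr.xml, which reads the zkClientTimeout system
 property, one way to try this without editing any config files is to pass
 it at startup, e.g. with the 5.0 bin/solr script:

   bin/solr start -c -z localhost:2181 -a "-DzkClientTimeout=40000"
 )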


 https://cwiki.apache.org/confluence/display/solr/Parameter+Reference#ParameterReference-SolrCloudInstanceZooKeeperParameters

 I think filing a bug, even if 40 seconds allows this to succeed, is a
 good idea ... but you might want to wait for some of the cloud experts
 to look at your logs to see if they have anything to add.

 Thanks,
 Shawn




-- 
Damien Kamerman


Re: solr cloud does not start with many collections

2015-03-01 Thread Damien Kamerman
I still see the same cloud startup issue with Solr 5.0.0. I created 4,000
collections from scratch and then attempted to stop/start the cloud.

node1:
WARN  - 2015-03-02 18:09:02.371;
org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
WARN  - 2015-03-02 18:10:07.196; org.apache.solr.cloud.ZkController; Timed
out waiting to see all nodes published as DOWN in our cluster state.
WARN  - 2015-03-02 18:13:46.238; org.apache.solr.cloud.ZkController; Still
seeing conflicting information about the leader of shard shard1 for
collection DD-3219 after 30 seconds; our state says
http://host:8002/solr/DD-3219_shard1_replica1/, but ZooKeeper says
http://host:8000/solr/DD-3219_shard1_replica2/

node2:
WARN  - 2015-03-02 18:09:01.871;
org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
WARN  - 2015-03-02 18:17:04.458;
org.apache.solr.common.cloud.ZkStateReader$3; ZooKeeper watch triggered,
but Solr cannot talk to ZK
stop/start
WARN  - 2015-03-02 18:53:12.725;
org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
WARN  - 2015-03-02 18:56:30.702; org.apache.solr.cloud.ZkController; Still
seeing conflicting information about the leader of shard shard1 for
collection DD-3581 after 30 seconds; our state says
http://host:8001/solr/DD-3581_shard1_replica2/, but ZooKeeper says
http://host:8002/solr/DD-3581_shard1_replica1/

node3:
WARN  - 2015-03-02 18:09:03.022;
org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog
WARN  - 2015-03-02 18:10:08.178; org.apache.solr.cloud.ZkController; Timed
out waiting to see all nodes published as DOWN in our cluster state.
WARN  - 2015-03-02 18:13:47.737; org.apache.solr.cloud.ZkController; Still
seeing conflicting information about the leader of shard shard1 for
collection DD-2707 after 30 seconds; our state says
http://host:8002/solr/DD-2707_shard1_replica2/, but ZooKeeper says
http://host:8000/solr/DD-2707_shard1_replica1/



On 27 February 2015 at 17:48, Shawn Heisey apa...@elyograg.org wrote:

 On 2/26/2015 11:14 PM, Damien Kamerman wrote:
  I've run into an issue with starting my solr cloud with many collections.
  My setup is:
  3 nodes (solr 4.10.3 ; 64GB RAM each ; jdk1.8.0_25) running on a single
  server (256GB RAM).
  5,000 collections (1 x shard ; 2 x replica) = 10,000 cores
  1 x Zookeeper 3.4.6
  Java arg -Djute.maxbuffer=67108864 added to solr and ZK.
 
  Then I stop all nodes, then start all nodes. All replicas are in the down
  state, some have no leader. At times I have seen some (12 or so) leaders
 in
  the active state. In the solr logs I see lots of:
 
  org.apache.solr.cloud.ZkController; Still seeing conflicting information
  about the leader of shard shard1 for collection DD-4351 after 30
  seconds; our state says
 http://ftea1:8001/solr/DD-4351_shard1_replica1/,
  but ZooKeeper says http://ftea1:8000/solr/DD-4351_shard1_replica2/

 snip

  I've tried staggering the starts (1min) but does not help.
  I've reproduced with zero documents.
  Restarts are OK up to around 3,000 cores.
  Should this work?

 This is going to push SolrCloud beyond its limits.  Is this just an
 exercise to see how far you can push Solr, or are you looking at setting
 up a production install with several thousand collections?

 In Solr 4.x, the clusterstate is one giant JSON structure containing the
 state of the entire cloud.  With 5000 collections, the entire thing
 would need to be downloaded and uploaded at least 5000 times during the
 course of a successful full system startup ... and I think with
 replicationFactor set to 2, that might actually be 10,000 times. The
 best-case scenario is that it would take a VERY long time, the
 worst-case scenario is that concurrency problems would lead to a
 deadlock.  A deadlock might be what is happening here.

 In Solr 5.x, the clusterstate is broken up so there's a separate state
 structure for each collection.  This setup allows for faster and safer
 multi-threading and far less data transfer.  Assuming I understand the
 implications correctly, there might not be any need to increase
 jute.maxbuffer with 5.x ... although I have to assume that I might be
 wrong about that.

 I would very much recommend that you set your scenario up from scratch
 in Solr 5.0.0, to see if the new clusterstate format can eliminate the
 problem you're seeing.  If it doesn't, then we can pursue it as a likely
 bug in the 5.x branch and you can file an issue in Jira.

 Thanks,
 Shawn




-- 
Damien Kamerman


Re: solr cloud does not start with many collections

2015-02-26 Thread Damien Kamerman
Oh, and I was wondering if 'leaderVoteWait' might help in Solr 4.

On 27 February 2015 at 18:04, Damien Kamerman dami...@gmail.com wrote:

 This is going to push SolrCloud beyond its limits.  Is this just an
 exercise to see how far you can push Solr, or are you looking at setting
 up a production install with several thousand collections?


 I'm looking towards production.


 In Solr 4.x, the clusterstate is one giant JSON structure containing the
 state of the entire cloud.  With 5000 collections, the entire thing
 would need to be downloaded and uploaded at least 5000 times during the
 course of a successful full system startup ... and I think with
 replicationFactor set to 2, that might actually be 10,000 times. The
 best-case scenario is that it would take a VERY long time, the
 worst-case scenario is that concurrency problems would lead to a
 deadlock.  A deadlock might be what is happening here.


 Yes, clusterstate.json is 3.3M. At times on startup I think it does
 deadlock; log shows after 1min:
 org.apache.solr.cloud.ZkController; Timed out waiting to see all nodes
 published as DOWN in our cluster state.


 In Solr 5.x, the clusterstate is broken up so there's a separate state
 structure for each collection.  This setup allows for faster and safer
 multi-threading and far less data transfer.  Assuming I understand the
 implications correctly, there might not be any need to increase
 jute.maxbuffer with 5.x ... although I have to assume that I might be
 wrong about that.

 I would very much recommend that you set your scenario up from scratch
 in Solr 5.0.0, to see if the new clusterstate format can eliminate the
 problem you're seeing.  If it doesn't, then we can pursue it as a likely
 bug in the 5.x branch and you can file an issue in Jira.


 Thanks, will test in Solr 5.0.0.




-- 
Damien Kamerman


Re: solr cloud does not start with many collections

2015-02-26 Thread Damien Kamerman

 This is going to push SolrCloud beyond its limits.  Is this just an
 exercise to see how far you can push Solr, or are you looking at setting
 up a production install with several thousand collections?


I'm looking towards production.


 In Solr 4.x, the clusterstate is one giant JSON structure containing the
 state of the entire cloud.  With 5000 collections, the entire thing
 would need to be downloaded and uploaded at least 5000 times during the
 course of a successful full system startup ... and I think with
 replicationFactor set to 2, that might actually be 10,000 times. The
 best-case scenario is that it would take a VERY long time, the
 worst-case scenario is that concurrency problems would lead to a
 deadlock.  A deadlock might be what is happening here.


Yes, clusterstate.json is 3.3M. At times on startup I think it does
deadlock; log shows after 1min:
org.apache.solr.cloud.ZkController; Timed out waiting to see all nodes
published as DOWN in our cluster state.
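(For reference, that 3.3M can be read straight out of ZooKeeper; the zkcli.sh
path below is the usual 4.x location, adjust if yours differs:

  example/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 \
      -cmd get /clusterstate.json | wc -c
)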


 In Solr 5.x, the clusterstate is broken up so there's a separate state
 structure for each collection.  This setup allows for faster and safer
 multi-threading and far less data transfer.  Assuming I understand the
 implications correctly, there might not be any need to increase
 jute.maxbuffer with 5.x ... although I have to assume that I might be
 wrong about that.

 I would very much recommend that you set your scenario up from scratch
 in Solr 5.0.0, to see if the new clusterstate format can eliminate the
 problem you're seeing.  If it doesn't, then we can pursue it as a likely
 bug in the 5.x branch and you can file an issue in Jira.


Thanks, will test in Solr 5.0.0.


solr cloud does not start with many collections

2015-02-26 Thread Damien Kamerman
I've run into an issue with starting my solr cloud with many collections.
My setup is:
3 nodes (solr 4.10.3 ; 64GB RAM each ; jdk1.8.0_25) running on a single
server (256GB RAM).
5,000 collections (1 x shard ; 2 x replica) = 10,000 cores
1 x Zookeeper 3.4.6
Java arg -Djute.maxbuffer=67108864 added to solr and ZK.
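(For completeness, each node is just a stock 4.10.3 start.jar launch on its
own port, roughly along these lines; the node directory, ports and ZK address
are examples:

  cd node1/example && java -d64 -server -Xss512k -Xms32G -Xmx32G \
      -Djetty.port=8000 -DzkHost=localhost:2181 \
      -Djute.maxbuffer=67108864 -jar start.jar
)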

Then I stop all nodes, then start all nodes. All replicas are in the down
state, some have no leader. At times I have seen some (12 or so) leaders in
the active state. In the solr logs I see lots of:

org.apache.solr.cloud.ZkController; Still seeing conflicting information
about the leader of shard shard1 for collection DD-4351 after 30
seconds; our state says http://ftea1:8001/solr/DD-4351_shard1_replica1/,
but ZooKeeper says http://ftea1:8000/solr/DD-4351_shard1_replica2/

org.apache.solr.common.SolrException;
:org.apache.solr.common.SolrException: Error getting leader from zk for
shard shard1
at
org.apache.solr.cloud.ZkController.getLeader(ZkController.java:910)
at
org.apache.solr.cloud.ZkController.register(ZkController.java:822)
at
org.apache.solr.cloud.ZkController.register(ZkController.java:770)
at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:221)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: There is conflicting
information about the leader of shard: shard1 our state says:
http://ftea1:8001/solr/DD-1564_shard1_replica2/ but zookeeper says:
http://ftea1:8000/solr/DD-1564_shard1_replica1/
at
org.apache.solr.cloud.ZkController.getLeader(ZkController.java:889)
... 6 more

I've tried staggering the starts (1min) but does not help.
I've reproduced with zero documents.
Restarts are OK up to around 3,000 cores.
Should this work?

Damien.


Facet search and growing memory usage

2014-04-09 Thread Damien Kamerman
Hi All,

What I have found with Solr 4.6.0 to 4.7.1 is that memory usage continues
to grow with facet queries.

Originally I saw the issue with 40 facets over 60 collections (distributed
search). Memory usage would spike and solr would become unresponsive like
https://issues.apache.org/jira/browse/SOLR-2855

Then I tried to determine a safe limit at which the search would work
without breaking solr. But what I found is that I can break solr in the
same way with one facet (with many distinct values) and one collection. By
holding F5 (reload) in the browser for 10 seconds memory usage continues to
grow.

e.g.
http://localhost:8000/solr/collection/select?facet=true&facet.mincount=1&q=*:*&facet.threads=5&facet.field=id

I realize that faceting on 'id' is extreme but it seems to highlight the
issue that memory usage continues to grow (leak?) with each new query until
solr eventually breaks.
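The same effect shows up without a browser; a plain curl loop like this one
(same URL as above, loop count arbitrary) should reproduce it:

  for i in `seq 1 200`
  do
    curl -s "http://localhost:8000/solr/collection/select?facet=true&facet.mincount=1&q=*:*&facet.threads=5&facet.field=id" > /dev/null
  done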

This does not happen with the 'old' method 'facet.method=enum' - memory
usage is stable and solr is unbreakable with my hold-reload test.
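(For comparison, the stable case is the same request with the old method
forced per query, e.g.:

  curl -s "http://localhost:8000/solr/collection/select?facet=true&facet.mincount=1&q=*:*&facet.field=id&facet.method=enum" > /dev/null
)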

This post
http://shal.in/post/285908948/inside-solr-improvements-in-faceted-search-performance
describes the new/current facet method and states
The structure is thrown away and re-created lazily on a commit. There
might be a few concerns around the garbage accumulated by the (re)-creation
of the many arrays needed for this structure. However, the performance gain
is significant enough to warrant the trade-off.

The wiki http://wiki.apache.org/solr/SimpleFacetParameters#facet.method
says the new/default method 'tends to use less memory'.

I use autoCommit (1min) on my collections - does that mean there's a
one-minute (or longer, with no new docs) window where facet queries will
effectively 'leak'?

Test setup. JDK 1.7.0u40 64-bit; Solr 4.7.1; 3 instances; 64GB each; 17m
docs; 2 replicas.

Cheers,
Damien.