Just confirmed that you do need to create the core directory before doing
the SPLITSHARD (at least with HDFS) - otherwise it fails with errors saying
it cannot find classes, such as the cluster classes.
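
For reference, the rough sequence I'm using looks something like this (host
and port are placeholders, the collection/core names follow the core name
from my earlier messages below, and I'm assuming the new core's index
directory goes directly under solr.hdfs.home and is named after the core):

    hadoop fs -mkdir -p hdfs://nameservice1:8020/solr6/COLLECT_shard1_0_replica1
    curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=COLLECT&shard=shard1"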

I've noticed that the disk usage on HDFS goes up when I do the split - for
example, if I split a 100G shard, the total index size goes up by about 100G
with the two new shards.  Is this expected behavior when running on HDFS?
Thank you!

-Joe

On Mon, Nov 17, 2014 at 7:12 PM, Joseph Obernberger <
joseph.obernber...@gmail.com> wrote:

> Looks like the shard split failed and only created one additional shard.
> I didn't allocate enough memory for 3x, since two additional shards needed
> to be created.  I was allocating 20G for each shard, so in order to do the
> split, I needed to give 60G for direct memory access.  I've now switched it
> to 10G and run the split again - that works, but I still need to build the
> directories beforehand, otherwise I get the "cannot find class" problem.
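>
> (In other words, during the split the parent core and the two new sub-shard
> cores are all open on the same node, so at 20G of block cache per core that
> is 3 x 20G = 60G of direct memory, versus about 3 x 10G = 30G at the new
> setting.)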
>
> Here are my HDFS parameters:
>   <directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
>     <bool name="solr.hdfs.blockcache.enabled">true</bool>
>     <int name="solr.hdfs.blockcache.slab.count">80</int>
>     <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
>     <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
>     <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
>     <bool name="solr.hdfs.blockcache.write.enabled">false</bool>
>     <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
>     <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">64</int>
>     <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">512</int>
>     <str name="solr.hdfs.home">hdfs://nameservice1:8020/solr6</str>
>     <str name="solr.hdfs.confdir">/etc/hadoop/conf.cloudera.hdfs1</str>
>   </directoryFactory>
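>
> If my math is right (assuming the default 8 KB block cache block size), each
> slab is 16384 x 8 KB = 128 MB, so 80 slabs works out to roughly 80 x 128 MB
> = 10G of off-heap (direct) memory for the block cache per core.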
>
> I did have the slab.count set to 160 before, and just didn't have the RAM
> to try this out.  The split is now running, and I can see the amount of
> space used by the new shards increasing.  Looks like it's going to run
> overnight before it completes.
>
> -Joe
>
> On Mon, Nov 17, 2014 at 5:57 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> Tell us more about your HDFS setup. Specifically, how
>> do you have your HdfsDirectoryFactory specified in
>> solrconfig.xml?
>>
>> Because I don't think you should have to do things like
>> create the directory ahead of time.
>>
>> Best,
>> Erick
>>
>> On Mon, Nov 17, 2014 at 12:17 PM, Joseph Obernberger
>> <joseph.obernber...@gmail.com> wrote:
>> > Originally I had two shards on two machines - shard1 and shard2.
>> > I did a SPLITSHARD on shard1.
>> > Now I have shard1, shard2, and shard1_0.
>> > If I select the core (COLLECT_shard1_0_replica1) and execute a query, I
>> > get all the docs OK, but if I specify &distrib=false, I get 0 documents.
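>> >
>> > The queries look roughly like this (host and port are placeholders):
>> >
>> >   http://localhost:8983/solr/COLLECT_shard1_0_replica1/select?q=*:*
>> >   http://localhost:8983/solr/COLLECT_shard1_0_replica1/select?q=*:*&distrib=false
>> >
>> > The first returns all the docs; the second returns 0.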
>> >
>> > Under HDFS - when/how will the new core start to get data?
>> > Thank you!
>> >
>> > -Joe
>>
>
>
