The load lasts for some more time after I stop the indexing. At first the load was on the first node; after I restarted the indexing process, the load moved to the second node and the first node worked properly.
Thanks,
Mahmoud

On Mon, Oct 16, 2017 at 5:29 PM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:

> Does the load stop when you stop indexing, or does it last for some more
> time? Is it always one node that behaves like this, and does the load
> start as soon as you start indexing? Is the load different between the
> nodes when you are doing lighter indexing?
>
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
> On 16 Oct 2017, at 13:35, Mahmoud Almokadem <prog.mahm...@gmail.com> wrote:
> >
> > The transition of the load happened after I restarted the bulk insert
> > process.
> >
> > The size of the index on each server is about 500GB.
> >
> > There are about 8 "not found segment file" warnings on each server, like
> > this one:
> >
> > Error getting file length for [segments_2s4]
> >
> > java.nio.file.NoSuchFileException: /media/ssd_losedata/solr-home/data/documents_online_shard16_replica_n1/data/index/segments_2s4
> > at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
> > at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
> > at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
> > at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
> > at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:145)
> > at java.base/sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
> > at java.base/java.nio.file.Files.readAttributes(Files.java:1755)
> > at java.base/java.nio.file.Files.size(Files.java:2369)
> > at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
> > at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:128)
> > at org.apache.solr.handler.admin.LukeRequestHandler.getFileLength(LukeRequestHandler.java:611)
> > at org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo(LukeRequestHandler.java:584)
> > at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:136)
> > at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2474)
> > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:720)
> > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:526)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:378)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:322)
> > at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> > at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> > at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> > at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> > at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> > at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> > at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> > at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> > at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> > at org.eclipse.jetty.server.Server.handle(Server.java:534)
> > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> > at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> > at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> > at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> > at java.base/java.lang.Thread.run(Thread.java:844)
> >
> > On Mon, Oct 16, 2017 at 1:08 PM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> >
> >> I did not look at the graph details - now I see that it is over a 3h
> >> time span. It seems that there was load on the other server before this
> >> one, and it ended with a 14GB read spike and a 10GB write spike just
> >> before the load started on this server. Do you see any errors or
> >> suspicious log lines?
> >> How big is your index?
> >>
> >> Emir
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>> On 16 Oct 2017, at 12:39, Mahmoud Almokadem <prog.mahm...@gmail.com> wrote:
> >>>
> >>> Yes, it has been constant since I started this bulk indexing process.
> >>> As you can see, the write operations on the loaded server are 3x those
> >>> of the normal server, although the disk writes are not 3x.
> >>>
> >>> Mahmoud
> >>>
> >>> On Mon, Oct 16, 2017 at 12:32 PM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> >>>
> >>>> Hi Mahmoud,
> >>>> Is this something that you see constantly? The network charts suggest
> >>>> that your servers are loaded equally, and since, as you said, you are
> >>>> not using routing, that is expected. Disk read/write and CPU are not
> >>>> equal, and that is expected during heavy indexing, since indexing
> >>>> also triggers segment merges, which require those resources. Even if
> >>>> two nodes host the same documents (e.g. leader and replica), merges
> >>>> are not likely to happen at the same time, so you can expect to see
> >>>> such cases.
> >>>>
> >>>> Thanks,
> >>>> Emir
> >>>> --
> >>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
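An aside on the "Error getting file length for [segments_2s4]" warning quoted above: the stack trace shows it being raised in LukeRequestHandler.getIndexInfo, i.e. in Solr's /admin/luke reporting endpoint (which the admin UI calls to show core details), not in the indexing path itself. A minimal SolrJ sketch that issues the same kind of request - the host here is hypothetical, and the core name is taken from the path in the trace - would look roughly like this:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrRequest;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.GenericSolrRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;
    import org.apache.solr.common.util.NamedList;

    public class LukeInfo {
        public static void main(String[] args) throws Exception {
            // Hypothetical base URL; core name taken from the stack trace above.
            try (SolrClient client = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/documents_online_shard16_replica_n1").build()) {
                ModifiableSolrParams params = new ModifiableSolrParams();
                params.set("numTerms", 0); // skip per-field top-terms to keep the call cheap
                // /admin/luke is the endpoint served by the LukeRequestHandler in the trace
                NamedList<Object> info = client.request(
                        new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/luke", params));
                System.out.println(info);
            }
        }
    }

If the warnings only appear while such admin or monitoring requests run, one plausible explanation is that a commit replaced the segments_N file between the directory listing and the size lookup, which would make the warning cosmetic rather than a sign of a damaged index.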
> >>>>
> >>>>> On 16 Oct 2017, at 11:58, Mahmoud Almokadem <prog.mahm...@gmail.com> wrote:
> >>>>>
> >>>>> Here are the screenshots of the metrics for the two servers on Amazon:
> >>>>>
> >>>>> https://ibb.co/kxBQam
> >>>>> https://ibb.co/fn0Jvm
> >>>>> https://ibb.co/kUpYT6
> >>>>>
> >>>>> On Mon, Oct 16, 2017 at 11:37 AM, Mahmoud Almokadem <prog.mahm...@gmail.com> wrote:
> >>>>>
> >>>>>> Hi Emir,
> >>>>>>
> >>>>>> We don't use routing.
> >>>>>>
> >>>>>> The servers are already balanced, and the number of documents on
> >>>>>> each shard is approximately the same.
> >>>>>>
> >>>>>> Nothing is running on the servers except Solr and ZooKeeper.
> >>>>>>
> >>>>>> I initialized the client as:
> >>>>>>
> >>>>>> String zkHost = "192.168.1.89:2181,192.168.1.99:2181";
> >>>>>>
> >>>>>> CloudSolrClient solrCloud = new CloudSolrClient.Builder()
> >>>>>>     .withZkHost(zkHost)
> >>>>>>     .build();
> >>>>>>
> >>>>>> solrCloud.setIdField("document_id");
> >>>>>> solrCloud.setDefaultCollection(collection);
> >>>>>> solrCloud.setRequestWriter(new BinaryRequestWriter());
> >>>>>>
> >>>>>> And the documents are approximately the same size.
> >>>>>>
> >>>>>> I used 10 threads with 10 SolrClients to send data to Solr, and
> >>>>>> every thread sends a batch of 1000 documents at a time.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Mahmoud
> >>>>>>
> >>>>>> On Mon, Oct 16, 2017 at 11:01 AM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> >>>>>>
> >>>>>>> Hi Mahmoud,
> >>>>>>> Do you use routing? Are your servers equally balanced - do you end
> >>>>>>> up with approximately the same number of documents hosted on both
> >>>>>>> servers (counting all shards)?
> >>>>>>> Do you have anything else running on those servers?
> >>>>>>> How do you initialise your SolrJ client?
> >>>>>>> Are the documents of similar size?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Emir
> >>>>>>> --
> >>>>>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>>>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>>>>>>
> >>>>>>>> On 16 Oct 2017, at 10:46, Mahmoud Almokadem <prog.mahm...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> We've installed SolrCloud 7.0.1 with two nodes and 8 shards per node.
> >>>>>>>>
> >>>>>>>> The configurations and the specs of the two servers are identical.
> >>>>>>>>
> >>>>>>>> When we run bulk indexing using SolrJ, one of the servers is fully
> >>>>>>>> loaded, as you can see in the images, while the other is normal.
> >>>>>>>>
> >>>>>>>> Image URLs:
> >>>>>>>>
> >>>>>>>> https://ibb.co/jkE6gR
> >>>>>>>> https://ibb.co/hyzvam
> >>>>>>>> https://ibb.co/mUpvam
> >>>>>>>> https://ibb.co/e4bxo6
> >>>>>>>>
> >>>>>>>> How can I figure out this issue?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Mahmoud
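A note for readers reproducing this setup: the indexing client Mahmoud describes (10 threads, one CloudSolrClient per thread, batches of 1000 documents, binary request writer) might look roughly like the sketch below. The collection name, the "body" field, and the document volume are placeholders, not taken from the thread:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class BulkIndexer {

        private static final String ZK_HOST = "192.168.1.89:2181,192.168.1.99:2181";
        private static final int THREADS = 10;
        private static final int BATCH_SIZE = 1000;
        private static final int DOCS_PER_THREAD = 100_000; // placeholder volume

        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(THREADS);
            for (int t = 0; t < THREADS; t++) {
                final int threadId = t;
                pool.submit(() -> indexChunk(threadId));
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.DAYS);
        }

        // Each thread owns its own client, as described in the thread.
        private static void indexChunk(int threadId) {
            try (CloudSolrClient client = new CloudSolrClient.Builder()
                    .withZkHost(ZK_HOST)
                    .build()) {
                client.setIdField("document_id");
                client.setDefaultCollection("documents_online"); // placeholder name
                client.setRequestWriter(new BinaryRequestWriter());

                List<SolrInputDocument> batch = new ArrayList<>(BATCH_SIZE);
                for (int i = 0; i < DOCS_PER_THREAD; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("document_id", threadId + "-" + i);
                    doc.addField("body", "placeholder content"); // placeholder field
                    batch.add(doc);
                    if (batch.size() == BATCH_SIZE) { // send in batches of 1000
                        client.add(batch);
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) {
                    client.add(batch); // flush the last partial batch
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

Since CloudSolrClient hashes the id field to pick each document's shard leader and no custom routing is in play, batches like these should spread evenly across both nodes; uneven CPU and disk load under such a client points at server-side work such as segment merges, as Emir suggests.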