The ES team told us they will fix this in the next release, probably in about two months. We wanted to catch this exception so that we could retry if the ES cluster is down (we have an SLA that guarantees users that every document will be indexed). We found two ways to work around it:

1. Send a request before the bulk insert to check whether the cluster is OK. We didn't do this, because it is an extra hop for every request, and when the cluster is healthy it makes no sense to waste time querying the cluster state.

2. Perform the action with a timeout and treat the cluster as down if the timeout is reached. This is what we did. It is not great either, because a bottleneck somewhere can give you a timeout even though the cluster is up and running. So you have to be very careful when choosing the timeout period; it may be worth running some stress/benchmarking tests first to see how fast you index. Just to be sure, we set it to 30s.
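Roughly what option 2 looks like, as a minimal sketch rather than our production code. It uses only java.util.concurrent and runs an arbitrary action with a per-attempt timeout, retrying when the timeout is hit. The names TimeoutRetrySketch, callWithTimeoutAndRetry and ClusterAssumedDownException are made up for illustration, and so are the attempt count and backoff values. With the ES client, the action would be something like the requestBuilder.execute().actionGet() call from the original post; I believe the 1.x ActionFuture also has actionGet overloads that take a timeout, which may be simpler, but check that against your client version. The lambdas need Java 8.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutRetrySketch {

    /** Hypothetical exception: every attempt timed out, so we assume the cluster is down. */
    static class ClusterAssumedDownException extends Exception {
        ClusterAssumedDownException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs the action on a worker thread and waits at most timeoutMillis for it.
     * A timeout is treated as "cluster down" and triggers a retry after backoffMillis.
     * A failure thrown by the action itself is rethrown immediately, without retrying.
     */
    static <T> T callWithTimeoutAndRetry(Callable<T> action, long timeoutMillis,
                                         int maxAttempts, long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        // Daemon threads, so a stuck action cannot keep the JVM alive.
        ExecutorService executor = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "bulk-timeout-worker");
            t.setDaemon(true);
            return t;
        });
        try {
            TimeoutException lastTimeout = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = executor.submit(action);
                try {
                    return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    // No answer in time: interrupt the worker and try again.
                    future.cancel(true);
                    lastTimeout = e;
                    if (attempt < maxAttempts) {
                        Thread.sleep(backoffMillis);
                    }
                } catch (ExecutionException e) {
                    // The action failed on its own: surface the real cause.
                    Throwable cause = e.getCause();
                    if (cause instanceof Exception) {
                        throw (Exception) cause;
                    }
                    throw e;
                }
            }
            throw new ClusterAssumedDownException(
                    "No response after " + maxAttempts + " attempts of " + timeoutMillis + " ms",
                    lastTimeout);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the bulk call; a real caller would pass the ES request here.
        String result = callWithTimeoutAndRetry(() -> "indexed", 30_000, 3, 1_000);
        System.out.println(result);
    }
}
```

The caveat from option 2 still applies: a slow but healthy cluster looks exactly like a dead one here, so a retry can index the same documents twice unless the documents carry explicit IDs.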
On Monday, August 4, 2014, 22:36:31 UTC+3, Brian wrote:
>
> Alex,
>
> By the way, is this bug seen with the TransportClient also, or just the
> NodeClient?
>
> Thanks!
>
> Brian
>
> On Monday, August 4, 2014 4:27:35 AM UTC-4, Alexander Reelsen wrote:
>>
>> Hey,
>>
>> Just a remote guess without knowing more: On your client side, the
>> exception is wrapped, so you need to unwrap it first.
>>
>> --Alex
>>
>> On Wed, Jul 23, 2014 at 9:47 AM, Cosmin-Radu Vasii <[email protected]> wrote:
>>
>>> I am using the dataless NodeClient to connect to my cluster (version is
>>> 1.1.1). Everything is working ok, except when failures occur. The scenario
>>> is the following:
>>> -I have an application java based which connects to ES Cluster
>>> (application is started and the cluster is up and running)
>>> -I shutdown the cluster
>>> -I try to send a bulk request
>>> -The following exception is displayed in the logs, which is normal. But
>>> my call never catches the exception:
>>>
>>> Exception in thread "elasticsearch[Lasher][generic][T#6]"
>>> org.elasticsearch.cluster.block.ClusterBlockException: blocked by:
>>> [SERVICE_UNAVAILABLE/1/state not recovered / initialized];[SERVICE_UNAVAILABLE/2/no master];
>>>     at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:138)
>>>     at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:128)
>>>     at org.elasticsearch.action.bulk.TransportBulkAction.executeBulk(TransportBulkAction.java:197)
>>>     at org.elasticsearch.action.bulk.TransportBulkAction.access$000(TransportBulkAction.java:65)
>>>     at org.elasticsearch.action.bulk.TransportBulkAction$1.onFailure(TransportBulkAction.java:143)
>>>     at org.elasticsearch.action.support.TransportAction$ThreadedActionListener$2.run(TransportAction.java:117)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:724)
>>>
>>> My code is something like this
>>>
>>> BulkResponse response;
>>> try {
>>>     response = requestBuilder.execute().actionGet();
>>> }
>>> catch(NoNodeAvailableException ex){
>>>     LOGGER.error("Cannot connect to ES Cluster: " + ex.getMessage());
>>>     throw ex;
>>> }
>>> catch (ClusterBlockException ex){
>>>     LOGGER.error("Cannot connect to ES Cluster: " + ex.getMessage());
>>>     throw ex;
>>> }
>>> catch (Exception ex) {
>>>     LOGGER.error("Exception in processing indexing request by ES server. " + ex.getMessage());
>>> }
>>>
>>> When I use a single request everything is ok. I also noticed a TODO in
>>> the ES code in the TransportBulkAction.java
>>>
>>> private void executeBulk(final BulkRequest bulkRequest, final long startTime,
>>>         final ActionListener<BulkResponse> listener,
>>>         final AtomicArray<BulkItemResponse> responses ) {
>>>     ClusterState clusterState = clusterService.state();
>>>     // TODO use timeout to wait here if its blocked...
>>>     clusterState.blocks().globalBlockedRaiseException(ClusterBlockLevel.WRITE);
>>>     ....}
>>>
>>> Is this a known situation or a known bug or I am missing something?
