Writing ORC files with Storm via Java API

2017-07-25 Thread Igor Kuzmenko
Is there any implementation of a Storm bolt that can write files to HDFS in ORC format without using the Hive Streaming API? I've found a Java API for writing ORC files, and I'm wondering whether any existing Hive bolts use it, or whether there are plans to create one.

Old Kafka spout performance tuning

2017-03-22 Thread Igor Kuzmenko
A Storm topology with the old Kafka spout connected to a local Kafka shows great performance, which I'm satisfied with. But when I connect to an external Kafka located on a separate cluster, spout performance drops significantly, and the same topology runs 10 times slower. I've already tried to

Re: New Kafka Spout doesn't move offset to the latest

2017-03-02 Thread Igor Kuzmenko
Sounds strange. Can you explain how it will help me? On Mar 2, 2017, at 10:04 PM, "Sree V" <sree_at_ch...@yahoo.com> wrote: > use different topology name and spout id and submit again. > > > Thanking you. > With Regards > Sree > > > On Thur

New Kafka Spout doesn't move offset to the latest

2017-03-02 Thread Igor Kuzmenko
I'm using the storm-kafka-client 1.1.1-SNAPSHOT build. After the topology starts, the Kafka spout reads all partitions from Kafka except one:
Id           Topic  Partition  Latest Offset  Spout Committed Offset  Lag
Kafka Spout  gtp    0          5726714188     5726700216              13972
Kafka Spout  gtp    1          5716936379     5716922137              14242
Kafka Spout  gtp    2

Re: Kafka Spout enable.auto.commit=false

2017-02-21 Thread Igor Kuzmenko
ed by auto.commit.interval.ms) the offset will > be committed. This can have an impact on the delivery guarantees, > because an offset may be committed, yet the tuple may fail. > > On Feb 20, 2017, at 8:15 AM, Igor Kuzmenko <f1she...@gmail.com> wrote: > > Hello, I'd like to

Kafka Spout enable.auto.commit=false

2017-02-20 Thread Igor Kuzmenko
Hello, I'd like to understand the difference between auto commit mode true/false in the new KafkaSpout. With enable.auto.commit = false, KafkaSpout will move my offset based on acked tuples, which seems straightforward. But what happens if I turn auto commit on? How does Kafka decide which offset to commit?
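One way to picture "moving the offset based on acked tuples" is a tracker that only advances past offsets that have been acked contiguously. The following is a minimal sketch under that assumption, not the storm-kafka-client implementation; the class AckedOffsetTracker and its methods are hypothetical names used only for illustration.

```java
import java.util.TreeSet;

// Hypothetical illustration: advance a commit position only over
// offsets that have been acked without gaps.
public class AckedOffsetTracker {
    // Acked offsets that are not yet covered by the commit position.
    private final TreeSet<Long> acked = new TreeSet<>();
    // Every offset below this value has been acked.
    private long nextOffset;

    public AckedOffsetTracker(long startOffset) {
        this.nextOffset = startOffset;
    }

    // Record that the tuple for this offset was acked downstream.
    public void ack(long offset) {
        if (offset >= nextOffset) {
            acked.add(offset);
        }
    }

    // Move the commit position forward while the acked offsets are contiguous,
    // and return the position that would be committed.
    public long commit() {
        while (!acked.isEmpty() && acked.first() == nextOffset) {
            acked.pollFirst();
            nextOffset++;
        }
        return nextOffset;
    }
}
```

In this model a single un-acked offset holds the commit position back even if later offsets are already acked, which is one plausible way a committed offset can lag on a partition.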

Re: Kafka spout stops committing offsets on some partitions

2017-02-16 Thread Igor Kuzmenko
> > On Feb 16, 2017, at 12:59 PM, Igor Kuzmenko <f1she...@gmail.com> wrote: > > Thanks for the reply, Hugo. > I'll double-check the log tomorrow, looking for KafkaSpoutRetryExponentialBackoff > calls. > > I just noticed that there's a strange thing in the log I have. First message

Re: Kafka spout stops committing offsets on some partitions

2017-02-16 Thread Igor Kuzmenko
ages, and may slow down processing considerably. > > You can also set the maxNumberOfRetires to a small number (e.g. 3-5) to > see if that solves this situation. > > Hugo > > > On Feb 16, 2017, at 8:36 AM, Igor Kuzmenko <f1she...@gmail.com> wrote: > > > > Tod

Kafka spout stops committing offsets on some partitions

2017-02-16 Thread Igor Kuzmenko
Today in Storm UI I saw this Kafka Spouts Lag:
Id           Topic       Partition  Latest Offset  Spout Committed Offset  Lag
Kafka Spout  test_topic  0          5591087        5562814                 28273
Kafka Spout  test_topic  1          2803256        2789090                 14166
Kafka Spout  test_topic  2          2801927        2787767                 14160
Kafka Spout  test_topic  3          2800627        2800626                 1
Kafka
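The Lag column in the figures above is consistent with the latest offset minus the spout's committed offset. A small sketch that recomputes it from the quoted numbers (the class and method names are hypothetical, chosen only for this illustration):

```java
// Hypothetical helper: recompute the per-partition lag from the quoted offsets.
public class SpoutLag {
    // Lag is the distance between the newest offset and the committed offset.
    static long lag(long latestOffset, long committedOffset) {
        return latestOffset - committedOffset;
    }

    public static void main(String[] args) {
        // {partition, latest offset, spout committed offset} as quoted above.
        long[][] partitions = {
            {0, 5591087L, 5562814L},
            {1, 2803256L, 2789090L},
            {2, 2801927L, 2787767L},
            {3, 2800627L, 2800626L},
        };
        for (long[] p : partitions) {
            System.out.printf("partition %d lag %d%n", p[0], lag(p[1], p[2]));
        }
    }
}
```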

Re: Kafka monitor unable to get offset lag

2017-02-01 Thread Igor Kuzmenko
the arguments you pasted returns the spout lags correctly > ? Also, can you confirm if you are running in a secured setup or not ? > > > > *From: *Igor Kuzmenko <f1she...@gmail.com> > *Reply-To: *"user@storm.apache.org" <user@storm.apache.org> > *Date: *We

Re: Kafka monitor unable to get offset lag

2017-02-01 Thread Igor Kuzmenko
--new-consumer --bootstrap-server > :6667 --list > ./kafka-consumer-groups.sh --new-consumer --bootstrap-server > :6667 --describe --group > > I hope it helps. > Florin > > On Wed, Feb 1, 2017 at 11:01 AM, Igor Kuzmenko <f1she...@gmail.com> wrote: > >> Yes, to

Re: Kafka monitor unable to get offset lag

2017-02-01 Thread Igor Kuzmenko
hone > > On Jan 31, 2017, at 4:34 AM, Igor Kuzmenko <f1she...@gmail.com> wrote: > > I've launched a topology with the new Kafka spout. The topology itself works > fine, but in Storm UI I see a kafka-monitor exception: > *Unable to get offset lags for kafka. Reason: > org

Kafka monitor unable to get offset lag

2017-01-31 Thread Igor Kuzmenko
I've launched a topology with the new Kafka spout. The topology itself works fine, but in Storm UI I see a kafka-monitor exception: *Unable to get offset lags for kafka. Reason: org.apache.kafka.shaded.common.errors.TimeoutException: Timeout expired while fetching topic metadata* Maybe I forgot

Re: Kafka spout stops emitting messages

2017-01-24 Thread Igor Kuzmenko
can change the number to a large number. > > ------ > Josh > > *From:* Igor Kuzmenko <f1she...@gmail.com> > *Date:* 2017-01-24 02:28 > *To:* user <user@storm.apache.org> > *Subject:* Kafka spout stops emitting messages > Hello, I'm

Kafka spout stops emitting messages

2017-01-23 Thread Igor Kuzmenko
Hello, I'm trying to upgrade my topology from the old Kafka spout (storm-kafka project) to the new one (storm-kafka-client), version 1.0.1. I've configured the new spout

Re: Massive Number of Spout Failures

2016-07-27 Thread Igor Kuzmenko
We have seen such failures for two reasons: 1) The bolt doesn't ack each tuple immediately, but collects a batch and acks them all at some point. In that case there can be a situation where the batch is bigger than max_spout_pending and some tuples fail. 2) The bolt doesn't ack the tuple at all. Make sure the bolt acks or fails tuples
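The first case can be pictured with a toy model: if the bolt waits for a full batch before acking, and the spout stops emitting once max_spout_pending tuples are un-acked, a batch larger than that limit never fills. This is a simplified sketch, not Storm code; the class BatchAckSketch and its parameters are hypothetical, and tuple timeouts are deliberately left out.

```java
// Toy model of a spout limited by max_spout_pending feeding a bolt
// that acks only when it has collected a full batch.
public class BatchAckSketch {
    // Returns how many tuples get acked within the given number of steps.
    static int simulate(int maxSpoutPending, int batchSize, int steps) {
        int pending = 0;   // tuples emitted by the spout but not yet acked
        int buffered = 0;  // tuples held by the bolt, waiting for a full batch
        int acked = 0;
        for (int i = 0; i < steps; i++) {
            // The spout emits only while it is under the pending limit.
            if (pending < maxSpoutPending) {
                pending++;
                buffered++;
            }
            // The bolt acks everything at once, but only for a full batch.
            if (buffered >= batchSize) {
                acked += buffered;
                pending -= buffered;
                buffered = 0;
            }
        }
        return acked;
    }

    public static void main(String[] args) {
        System.out.println("batch 20, limit 10: acked " + simulate(10, 20, 1000));
        System.out.println("batch 10, limit 20: acked " + simulate(20, 10, 1000));
    }
}
```

In the first call the spout stalls at 10 pending tuples and the batch of 20 is never completed, so nothing is acked; in a real topology those stuck tuples would eventually be failed by the message timeout.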

Max spout pending blocks tick tuple?

2016-07-26 Thread Igor Kuzmenko
Hello, I'd like to know whether the max_spout_pending setting affects the emitting of tick tuples. When I reach the max_spout_pending value and the spout stops emitting new tuples, does the topology stop emitting tick tuples as well?

Where kafka spout stores offset

2016-06-28 Thread Igor Kuzmenko
I'm using Storm v0.10.0. The offset is obviously stored in ZooKeeper, but in which one? The first ZooKeeper is the one I set in the Storm settings: "storm.zookeeper.servers". The second one is set in the ZkHosts object, which is part of SpoutConfig: public ZkHosts(String brokerZkStr). Right now they are the same in my case, but

Re: Topology code distribution takes too much time

2016-03-27 Thread Igor Kuzmenko
oodin...@gmail.com> > wrote: > >> Hi Igor. >> Try to dump threads and look at them. I think you can find the problem in the dump. >> >> Look at LAN, CPU and RAM load. Maybe your CPU is overloaded, or RAM or LAN. >> >> 2016-03-27 21:26 GMT+03:00 Igor Kuzmenko <f1

Topology code distribution takes too much time

2016-03-27 Thread Igor Kuzmenko
Hello, I'm using Hortonworks Data Platform v2.3.4 with the included Storm 0.10. After deploying my topology using the "*storm jar*" command, it takes about 3 min to distribute the code. The network is 10 Gb/s, the topology jar is about 150 MB, and the cluster has 2 nodes with supervisors on them, so I assume that this
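As a rough sanity check on the figures quoted above, the ideal time to move one copy of the jar over the link can be estimated as size divided by bandwidth. This sketch assumes 1 MB = 10^6 bytes and ignores protocol overhead, disk speed, and how Storm actually distributes code; the class name TransferEstimate is hypothetical.

```java
// Back-of-the-envelope estimate: ideal transfer time = size / bandwidth.
public class TransferEstimate {
    static double idealSeconds(double sizeMegabytes, double linkGigabitsPerSecond) {
        double bits = sizeMegabytes * 1_000_000 * 8;               // MB -> bits
        double bitsPerSecond = linkGigabitsPerSecond * 1_000_000_000; // Gb/s -> b/s
        return bits / bitsPerSecond;
    }

    public static void main(String[] args) {
        // 150 MB jar over a 10 Gb/s link, as stated in the message.
        System.out.printf("%.2f s per copy%n", idealSeconds(150, 10));
    }
}
```

Under these assumptions one copy takes on the order of a tenth of a second, so raw network bandwidth alone would not account for a distribution time of about 3 minutes.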

Re: Storm creates lots of .tmp files

2016-03-24 Thread Igor Kuzmenko
om a topology. >> >> Searching google for "pipeout file" implies they might be coming from >> Hive: >> >>- >> >> https://www.google.com/webhp?sourceid=chrome-instant=1=2=UTF-8#safe=off=pipeout%20file >> >> So do you have topologies

Re: Storm creates lots of .tmp files

2016-03-24 Thread Igor Kuzmenko
ing are stuffing there? e.g., the > org.xerial.snappy:snappy-java library spews .so files all over /tmp for me.) > > What is the version of storm you are using? > > Do you have some storm configuration setting pointing at /tmp/storm? > > - Erik > > On Wed, Mar 23, 2016 at 9:29 AM, Igo

Storm creates lots of .tmp files

2016-03-23 Thread Igor Kuzmenko
Today I got "*java.io.IOException: No space left on device*" because there were no inodes left. The reason is that there are thousands of .tmp files under /tmp/storm created by Storm. Why are there so many, and what are they for?