[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r85983568 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutBatchMetadata.java --- @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+
+package org.apache.storm.kafka.spout.trident;
+
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.clients.consumer.ConsumerRecords;
+import org.apache.kafka.common.TopicPartition;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.Serializable;
+import java.util.List;
+
+/**
+ * Wraps transaction batch information
+ */
+public class KafkaTridentSpoutBatchMetadata<K, V> implements Serializable {
+    private static final Logger LOG = LoggerFactory.getLogger(KafkaTridentSpoutBatchMetadata.class);
+
+    private TopicPartition topicPartition;  // topic partition of this batch
+    private long firstOffset;               // first offset of this batch
+    private long lastOffset;                // last offset of this batch
+
+    public KafkaTridentSpoutBatchMetadata(TopicPartition topicPartition, long firstOffset, long lastOffset) {
+        this.topicPartition = topicPartition;
+        this.firstOffset = firstOffset;
+        this.lastOffset = lastOffset;
+    }
+
+    public KafkaTridentSpoutBatchMetadata(TopicPartition topicPartition, ConsumerRecords<K, V> consumerRecords, KafkaTridentSpoutBatchMetadata<K, V> lastBatch) {
+        this.topicPartition = topicPartition;
+
+        List<ConsumerRecord<K, V>> records = consumerRecords.records(topicPartition);
+
+        if (records != null && !records.isEmpty()) {
+            firstOffset = records.get(0).offset();
+            lastOffset = records.get(records.size() - 1).offset();
+        } else {
+            if (lastBatch != null) {
+                firstOffset = lastBatch.firstOffset;
+                lastOffset = lastBatch.lastOffset;
+            }
+        }
+        LOG.debug("Created {}", this);
--- End diff --

Logging "this" will call the overridden toString() method for this class, which prints the first and last offset.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
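To make the remark about logging "this" concrete, here is a minimal sketch of the same idea using only the JDK. `BatchMetadataSketch`, its `String` partition field, and `describe()` are hypothetical stand-ins invented for illustration; they are not the Storm class, and `String.format` stands in for the SLF4J `LOG.debug("Created {}", this)` call.

```java
import java.util.List;

// Hypothetical stand-in for the batch metadata class under review; not the Storm class itself.
final class BatchMetadataSketch {
    private final String topicPartition; // stand-in for Kafka's TopicPartition
    private long firstOffset;
    private long lastOffset;

    // Offsets come from the first and last polled record; an empty poll reuses the previous batch.
    BatchMetadataSketch(String topicPartition, List<Long> recordOffsets, BatchMetadataSketch lastBatch) {
        this.topicPartition = topicPartition;
        if (recordOffsets != null && !recordOffsets.isEmpty()) {
            firstOffset = recordOffsets.get(0);
            lastOffset = recordOffsets.get(recordOffsets.size() - 1);
        } else if (lastBatch != null) {
            firstOffset = lastBatch.firstOffset;
            lastOffset = lastBatch.lastOffset;
        }
    }

    @Override
    public String toString() {
        return "BatchMetadataSketch{topicPartition=" + topicPartition
                + ", firstOffset=" + firstOffset + ", lastOffset=" + lastOffset + "}";
    }

    // Mirrors the logging call: formatting an object argument invokes its toString().
    String describe() {
        return String.format("Created %s", this);
    }
}
```

Because the message is built from `toString()`, the offsets only appear in the log if that method is overridden, which is the point of the review comment.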
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r85983348 --- Diff: examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentKafkaClientWordCountNamedTopics.java --- @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.starter.trident; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.storm.kafka.spout.KafkaSpoutConfig; +import org.apache.storm.kafka.spout.KafkaSpoutRetryExponentialBackoff; +import org.apache.storm.kafka.spout.KafkaSpoutRetryService; +import org.apache.storm.kafka.spout.KafkaSpoutStreams; +import org.apache.storm.kafka.spout.KafkaSpoutStreamsNamedTopics; +import org.apache.storm.kafka.spout.KafkaSpoutTupleBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilderNamedTopics; +import org.apache.storm.kafka.spout.trident.KafkaTridentSpoutManager; +import org.apache.storm.kafka.spout.trident.KafkaTridentSpoutOpaque; +import org.apache.storm.trident.Stream; +import org.apache.storm.trident.TridentState; +import org.apache.storm.trident.TridentTopology; +import org.apache.storm.trident.operation.builtin.Count; +import org.apache.storm.trident.operation.builtin.Debug; +import org.apache.storm.trident.testing.Split; +import org.apache.storm.tuple.Fields; +import org.apache.storm.tuple.Values; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.TimeUnit; + +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.EARLIEST; + +public class TridentKafkaClientWordCountNamedTopics extends TridentKafkaWordCount { +public TridentKafkaClientWordCountNamedTopics(String zkUrl, String brokerUrl) { +super(zkUrl, brokerUrl); +} + +protected TridentState addTridentState(TridentTopology tridentTopology) { +final Stream spoutStream = tridentTopology.newStream("spout1", createOpaqueKafkaSpoutNew()).parallelismHint(1); + +return spoutStream.each(spoutStream.getOutputFields(), new Debug(true)) +.each(new Fields("str"), new Split(), new Fields("word")) +.groupBy(new Fields("word")) +.persistentAggregate(new DebugMemoryMapState.Factory(), new Count(), new Fields("count")); +} + 
+private KafkaTridentSpoutOpaquecreateOpaqueKafkaSpoutNew() { +return new KafkaTridentSpoutOpaque (getKafkaTridentManager()); +} + +private KafkaTridentSpoutManager getKafkaTridentManager() { +return new KafkaTridentSpoutManager<>(getKafkaSpoutConfig(getKafkaSpoutStreams())); +} + +private KafkaSpoutConfig getKafkaSpoutConfig(KafkaSpoutStreams kafkaSpoutStreams) { +return new KafkaSpoutConfig.Builder (getKafkaConsumerProps(), kafkaSpoutStreams, getTuplesBuilder(), getRetryService()) +.setOffsetCommitPeriodMs(10_000) +.setFirstPollOffsetStrategy(EARLIEST) +.setMaxUncommittedOffsets(250) +.build(); +} + +protected Map getKafkaConsumerProps() { +Map props = new HashMap<>(); +props.put(KafkaSpoutConfig.Consumer.BOOTSTRAP_SERVERS, "127.0.0.1:9092"); +props.put(KafkaSpoutConfig.Consumer.GROUP_ID, "kafkaSpoutTestGroup"); +props.put(KafkaSpoutConfig.Consumer.KEY_DESERIALIZER, "org.apache.kafka.common.serialization.StringDeserializer"); +props.put(KafkaSpoutConfig.Consumer.VALUE_DESERIALIZER, "org.apache.kafka.common.serialization.StringDeserializer"); +props.put("max.partition.fetch.bytes", 200); +return props; +} + +
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r85983373 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutEmitter.java --- @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.kafka.spout.trident; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.clients.consumer.ConsumerRecords; +import org.apache.kafka.clients.consumer.KafkaConsumer; +import org.apache.kafka.clients.consumer.OffsetAndMetadata; +import org.apache.kafka.common.TopicPartition; +import org.apache.storm.kafka.spout.KafkaSpoutConfig; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.trident.operation.TridentCollector; +import org.apache.storm.trident.spout.IOpaquePartitionedTridentSpout; +import org.apache.storm.trident.topology.TransactionAttempt; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.EARLIEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.LATEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_LATEST; + +public class KafkaTridentSpoutEmitterimplements IOpaquePartitionedTridentSpout.Emitter >, Serializable { +private static final Logger LOG = LoggerFactory.getLogger(KafkaTridentSpoutEmitter.class); + +// Kafka +private final KafkaConsumer
kafkaConsumer; + +// Bookkeeping +private final KafkaTridentSpoutManager kafkaManager; +// Declare some KafkaTridentSpoutManager references for convenience +private final KafkaSpoutTuplesBuilder tuplesBuilder; +private final long pollTimeoutMs; +private final KafkaSpoutConfig.FirstPollOffsetStrategy firstPollOffsetStrategy; + +public KafkaTridentSpoutEmitter(KafkaTridentSpoutManager kafkaManager) { +this.kafkaManager = kafkaManager; +this.kafkaManager.subscribeKafkaConsumer(); + +//must subscribeKafkaConsumer before this line +kafkaConsumer = kafkaManager.getKafkaConsumer(); + +tuplesBuilder = kafkaManager.getTuplesBuilder(); +final KafkaSpoutConfig kafkaSpoutConfig = kafkaManager.getKafkaSpoutConfig(); +pollTimeoutMs = kafkaSpoutConfig.getPollTimeoutMs(); +firstPollOffsetStrategy = kafkaSpoutConfig.getFirstPollOffsetStrategy(); +LOG.debug("Created {}", this); +} + +@Override +public KafkaTridentSpoutBatchMetadata emitPartitionBatch(TransactionAttempt tx, TridentCollector collector, +KafkaTridentSpoutTopicPartition partitionTs, KafkaTridentSpoutBatchMetadata lastBatch) { +LOG.debug("Emitting batch: [transaction = {}], [partition = {}], [collector = {}], [lastBatchMetadata = {}]", +tx, partitionTs, collector, lastBatch); + +final TopicPartition topicPartition = partitionTs.getTopicPartition(); +KafkaTridentSpoutBatchMetadata currentBatch = lastBatch; +Collection pausedTopicPartitions = Collections.EMPTY_SET; + +try { +// pause other topic partitions to only poll from current topic partition +pausedTopicPartitions = pauseTopicPartitions(topicPartition); + +seek(topicPartition,
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r85981903 --- Diff: examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentKafkaClientWordCountNamedTopics.java --- @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.starter.trident; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.storm.kafka.spout.KafkaSpoutConfig; +import org.apache.storm.kafka.spout.KafkaSpoutRetryExponentialBackoff; +import org.apache.storm.kafka.spout.KafkaSpoutRetryService; +import org.apache.storm.kafka.spout.KafkaSpoutStreams; +import org.apache.storm.kafka.spout.KafkaSpoutStreamsNamedTopics; +import org.apache.storm.kafka.spout.KafkaSpoutTupleBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilderNamedTopics; +import org.apache.storm.kafka.spout.trident.KafkaTridentSpoutManager; +import org.apache.storm.kafka.spout.trident.KafkaTridentSpoutOpaque; +import org.apache.storm.trident.Stream; +import org.apache.storm.trident.TridentState; +import org.apache.storm.trident.TridentTopology; +import org.apache.storm.trident.operation.builtin.Count; +import org.apache.storm.trident.operation.builtin.Debug; +import org.apache.storm.trident.testing.Split; +import org.apache.storm.tuple.Fields; +import org.apache.storm.tuple.Values; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.TimeUnit; + +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.EARLIEST; + +public class TridentKafkaClientWordCountNamedTopics extends TridentKafkaWordCount { +public TridentKafkaClientWordCountNamedTopics(String zkUrl, String brokerUrl) { +super(zkUrl, brokerUrl); +} + +protected TridentState addTridentState(TridentTopology tridentTopology) { +final Stream spoutStream = tridentTopology.newStream("spout1", createOpaqueKafkaSpoutNew()).parallelismHint(1); + +return spoutStream.each(spoutStream.getOutputFields(), new Debug(true)) +.each(new Fields("str"), new Split(), new Fields("word")) +.groupBy(new Fields("word")) +.persistentAggregate(new DebugMemoryMapState.Factory(), new Count(), new Fields("count")); +} + 
+    private KafkaTridentSpoutOpaque<String, String> createOpaqueKafkaSpoutNew() {
+        return new KafkaTridentSpoutOpaque<>(getKafkaTridentManager());
--- End diff --

Partially done in the refactored examples in this [PR](https://github.com/apache/storm/pull/1757). I removed some redundant "factory methods". However, the code that creates the "dependency" objects that need to be passed in is more than one or two lines. I believe that a method with a meaningful name that creates and initializes these "dependency" objects makes the code much more cohesive and easier to read. Furthermore, this class is extended for wildcard topics, and some of these methods are overridden. I will be happy to write a more copy-and-paste-friendly example in the docs if you feel it's appropriate. Please let me know.
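The design argument in this reply (named factory methods that a subclass overrides for wildcard topics) can be sketched without any Storm or Kafka dependency. Everything below is hypothetical: the class names, the `topicSubscription()` and `consumerProps()` methods, and the string that stands in for the spout are assumptions made for illustration. Only the property values and the wildcard pattern are taken from the diffs in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins: none of these classes exist in Storm or Kafka.
class NamedTopicsExampleSketch {
    // Each named method builds one "dependency" object, so a subclass can override just that step.
    protected String topicSubscription() {
        return "named:test-trident"; // assumed topic name, for illustration only
    }

    protected Map<String, Object> consumerProps() {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("bootstrap.servers", "127.0.0.1:9092");
        props.put("group.id", "kafkaSpoutTestGroup");
        return props;
    }

    // The one place that shows the order in which the pieces are assembled.
    final String buildSpoutDescription() {
        return "spout[" + topicSubscription() + ", " + consumerProps() + "]";
    }
}

class WildcardTopicsExampleSketch extends NamedTopicsExampleSketch {
    @Override
    protected String topicSubscription() {
        return "pattern:test-trident(-1)?";
    }
}
```

With this shape the wildcard variant changes a single method, which is the trade-off the reply weighs against inlining every step into one method.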
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r85980768 --- Diff: examples/storm-starter/src/jvm/org/apache/storm/starter/trident/DebugMemoryMapState.java --- @@ -0,0 +1,73 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+
+package org.apache.storm.starter.trident;
+
+import org.apache.storm.task.IMetricsContext;
+import org.apache.storm.topology.FailedException;
+import org.apache.storm.trident.state.CombinerValueUpdater;
+import org.apache.storm.trident.state.State;
+import org.apache.storm.trident.state.StateFactory;
+import org.apache.storm.trident.state.ValueUpdater;
+import org.apache.storm.trident.testing.MemoryMapState;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+
+public class DebugMemoryMapState<T> extends MemoryMapState<T> {
+    private static final Logger LOG = LoggerFactory.getLogger(DebugMemoryMapState.class);
+
+    private int updateCount = 0;
+
+    public DebugMemoryMapState(String id) {
+        super(id);
+    }
+
+    public List<T> multiUpdate(List<List<Object>> keys, List<ValueUpdater> updaters) {
+        print(keys, updaters);
+        if ((updateCount++ % 5) == 0) {
+            LOG.error("Throwing FailedException");
+            throw new FailedException("Enforced State Update Fail. On retrial should replay the exact same batch.");
+        }
+        return super.multiUpdate(keys, updaters);
+    }
+
+    private void print(List<List<Object>> keys, List<ValueUpdater> updaters) {
+        for (int i = 0; i < keys.size(); i++) {
+            ValueUpdater valueUpdater = updaters.get(i);
+            Object arg = ((CombinerValueUpdater) valueUpdater).getArg();
+            LOG.debug("updateCount = {}, keys = {} => updaterArgs = {}", updateCount, keys.get(i), arg);
--- End diff --

Done. Refactored examples in this [PR](https://github.com/apache/storm/pull/1757).
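The failure injection in `multiUpdate` (`(updateCount++ % 5) == 0`) is easy to misread, so the following sketch isolates it. `FailingStateSketch`, `update`, and `attemptsUntilApplied` are hypothetical names, `IllegalStateException` stands in for Storm's `FailedException`, and the retry loop is only an assumed approximation of a framework replaying a failed batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a state that rejects every fifth update, starting with the first.
class FailingStateSketch {
    private int updateCount = 0;
    private final List<String> applied = new ArrayList<>();

    void update(String batch) {
        // Post-increment: the check sees 0, 1, 2, ... so the 1st, 6th, 11th, ... calls fail.
        if ((updateCount++ % 5) == 0) {
            throw new IllegalStateException("Enforced state update failure");
        }
        applied.add(batch);
    }

    List<String> applied() {
        return applied;
    }

    // Stand-in for a retry: replays the same batch until the state accepts it.
    static int attemptsUntilApplied(FailingStateSketch state, String batch) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                state.update(batch);
                return attempts;
            } catch (IllegalStateException e) {
                // Retry with the exact same batch.
            }
        }
    }
}
```

In this sketch the first batch needs two attempts, the next three need one each, and the fifth needs two again, while every batch is applied exactly once.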
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r85980640 --- Diff: examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentKafkaClientWordCountWildcardTopics.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.storm.starter.trident; + +import org.apache.storm.kafka.spout.KafkaSpoutStream; +import org.apache.storm.kafka.spout.KafkaSpoutStreams; +import org.apache.storm.kafka.spout.KafkaSpoutStreamsWildcardTopics; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilderWildcardTopics; +import org.apache.storm.tuple.Fields; + +import java.util.regex.Pattern; + +public class TridentKafkaClientWordCountWildcardTopics extends TridentKafkaClientWordCountNamedTopics { +private static final String TOPIC_WILDCARD_PATTERN = "test-trident(-1)?"; + +public TridentKafkaClientWordCountWildcardTopics(String zkUrl, String brokerUrl) { +super(zkUrl, brokerUrl); +} + +public static void main(String[] args) throws Exception { +final String[] zkBrokerUrl = parseUrl(args); --- End diff -- Agree. 
Refactored examples in this [PR](https://github.com/apache/storm/pull/1757).
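For readers unsure which topics `TOPIC_WILDCARD_PATTERN` covers, this small sketch checks a few names with `java.util.regex`. `WildcardTopicSketch` and `selects` are hypothetical helpers, and the sketch assumes that a topic is selected only when its whole name matches the pattern.

```java
import java.util.regex.Pattern;

final class WildcardTopicSketch {
    // Pattern string copied from TOPIC_WILDCARD_PATTERN in the diff above.
    static final Pattern TOPIC_PATTERN = Pattern.compile("test-trident(-1)?");

    // Assumption: a topic is selected only when the whole name matches the pattern.
    static boolean selects(String topicName) {
        return TOPIC_PATTERN.matcher(topicName).matches();
    }
}
```

Under that assumption the pattern selects exactly two names, `test-trident` and `test-trident-1`, and rejects names such as `test-trident-2`.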
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r85643212 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutEmitter.java --- @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.kafka.spout.trident; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.clients.consumer.ConsumerRecords; +import org.apache.kafka.clients.consumer.KafkaConsumer; +import org.apache.kafka.clients.consumer.OffsetAndMetadata; +import org.apache.kafka.common.TopicPartition; +import org.apache.storm.kafka.spout.KafkaSpoutConfig; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.trident.operation.TridentCollector; +import org.apache.storm.trident.spout.IOpaquePartitionedTridentSpout; +import org.apache.storm.trident.topology.TransactionAttempt; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.EARLIEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.LATEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_LATEST; + +public class KafkaTridentSpoutEmitterimplements IOpaquePartitionedTridentSpout.Emitter >, Serializable { +private static final Logger LOG = LoggerFactory.getLogger(KafkaTridentSpoutEmitter.class); + +// Kafka +private final KafkaConsumer
kafkaConsumer; + +// Bookkeeping +private final KafkaTridentSpoutManager kafkaManager; +// Declare some KafkaTridentSpoutManager references for convenience +private final KafkaSpoutTuplesBuilder tuplesBuilder; +private final long pollTimeoutMs; +private final KafkaSpoutConfig.FirstPollOffsetStrategy firstPollOffsetStrategy; + +public KafkaTridentSpoutEmitter(KafkaTridentSpoutManager kafkaManager) { +this.kafkaManager = kafkaManager; +this.kafkaManager.subscribeKafkaConsumer(); + +//must subscribeKafkaConsumer before this line +kafkaConsumer = kafkaManager.getKafkaConsumer(); + +tuplesBuilder = kafkaManager.getTuplesBuilder(); +final KafkaSpoutConfig kafkaSpoutConfig = kafkaManager.getKafkaSpoutConfig(); +pollTimeoutMs = kafkaSpoutConfig.getPollTimeoutMs(); +firstPollOffsetStrategy = kafkaSpoutConfig.getFirstPollOffsetStrategy(); +LOG.debug("Created {}", this); +} + +@Override +public KafkaTridentSpoutBatchMetadata emitPartitionBatch(TransactionAttempt tx, TridentCollector collector, +KafkaTridentSpoutTopicPartition partitionTs, KafkaTridentSpoutBatchMetadata lastBatch) { +LOG.debug("Emitting batch: [transaction = {}], [partition = {}], [collector = {}], [lastBatchMetadata = {}]", +tx, partitionTs, collector, lastBatch); + +final TopicPartition topicPartition = partitionTs.getTopicPartition(); +KafkaTridentSpoutBatchMetadata currentBatch = lastBatch; +Collection pausedTopicPartitions = Collections.EMPTY_SET; + +try { +// pause other topic partitions to only poll from current topic partition +pausedTopicPartitions = pauseTopicPartitions(topicPartition); + +seek(topicPartition,
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user harshach commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r83286131 --- Diff: examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentKafkaClientWordCountNamedTopics.java --- @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.starter.trident; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.storm.kafka.spout.KafkaSpoutConfig; +import org.apache.storm.kafka.spout.KafkaSpoutRetryExponentialBackoff; +import org.apache.storm.kafka.spout.KafkaSpoutRetryService; +import org.apache.storm.kafka.spout.KafkaSpoutStreams; +import org.apache.storm.kafka.spout.KafkaSpoutStreamsNamedTopics; +import org.apache.storm.kafka.spout.KafkaSpoutTupleBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilderNamedTopics; +import org.apache.storm.kafka.spout.trident.KafkaTridentSpoutManager; +import org.apache.storm.kafka.spout.trident.KafkaTridentSpoutOpaque; +import org.apache.storm.trident.Stream; +import org.apache.storm.trident.TridentState; +import org.apache.storm.trident.TridentTopology; +import org.apache.storm.trident.operation.builtin.Count; +import org.apache.storm.trident.operation.builtin.Debug; +import org.apache.storm.trident.testing.Split; +import org.apache.storm.tuple.Fields; +import org.apache.storm.tuple.Values; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.TimeUnit; + +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.EARLIEST; + +public class TridentKafkaClientWordCountNamedTopics extends TridentKafkaWordCount { +public TridentKafkaClientWordCountNamedTopics(String zkUrl, String brokerUrl) { +super(zkUrl, brokerUrl); +} + +protected TridentState addTridentState(TridentTopology tridentTopology) { +final Stream spoutStream = tridentTopology.newStream("spout1", createOpaqueKafkaSpoutNew()).parallelismHint(1); + +return spoutStream.each(spoutStream.getOutputFields(), new Debug(true)) +.each(new Fields("str"), new Split(), new Fields("word")) +.groupBy(new Fields("word")) +.persistentAggregate(new DebugMemoryMapState.Factory(), new Count(), new Fields("count")); +} + 
+    private KafkaTridentSpoutOpaque<String, String> createOpaqueKafkaSpoutNew() {
+        return new KafkaTridentSpoutOpaque<>(getKafkaTridentManager());
--- End diff --

Can we merge this into a single method, so that it shows the series of steps in creating a KafkaTrident topology? It has a few redirections, with one method calling another, which can be confusing for users looking for an example.
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user harshach commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r83312306 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutEmitter.java --- @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.kafka.spout.trident; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.clients.consumer.ConsumerRecords; +import org.apache.kafka.clients.consumer.KafkaConsumer; +import org.apache.kafka.clients.consumer.OffsetAndMetadata; +import org.apache.kafka.common.TopicPartition; +import org.apache.storm.kafka.spout.KafkaSpoutConfig; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.trident.operation.TridentCollector; +import org.apache.storm.trident.spout.IOpaquePartitionedTridentSpout; +import org.apache.storm.trident.topology.TransactionAttempt; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.EARLIEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.LATEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_LATEST; + +public class KafkaTridentSpoutEmitterimplements IOpaquePartitionedTridentSpout.Emitter >, Serializable { +private static final Logger LOG = LoggerFactory.getLogger(KafkaTridentSpoutEmitter.class); + +// Kafka +private final KafkaConsumer
kafkaConsumer; + +// Bookkeeping +private final KafkaTridentSpoutManager kafkaManager; +// Declare some KafkaTridentSpoutManager references for convenience +private final KafkaSpoutTuplesBuilder tuplesBuilder; +private final long pollTimeoutMs; +private final KafkaSpoutConfig.FirstPollOffsetStrategy firstPollOffsetStrategy; + +public KafkaTridentSpoutEmitter(KafkaTridentSpoutManager kafkaManager) { +this.kafkaManager = kafkaManager; +this.kafkaManager.subscribeKafkaConsumer(); + +//must subscribeKafkaConsumer before this line +kafkaConsumer = kafkaManager.getKafkaConsumer(); + +tuplesBuilder = kafkaManager.getTuplesBuilder(); +final KafkaSpoutConfig kafkaSpoutConfig = kafkaManager.getKafkaSpoutConfig(); +pollTimeoutMs = kafkaSpoutConfig.getPollTimeoutMs(); +firstPollOffsetStrategy = kafkaSpoutConfig.getFirstPollOffsetStrategy(); +LOG.debug("Created {}", this); +} + +@Override +public KafkaTridentSpoutBatchMetadata emitPartitionBatch(TransactionAttempt tx, TridentCollector collector, +KafkaTridentSpoutTopicPartition partitionTs, KafkaTridentSpoutBatchMetadata lastBatch) { +LOG.debug("Emitting batch: [transaction = {}], [partition = {}], [collector = {}], [lastBatchMetadata = {}]", +tx, partitionTs, collector, lastBatch); + +final TopicPartition topicPartition = partitionTs.getTopicPartition(); +KafkaTridentSpoutBatchMetadata currentBatch = lastBatch; +Collection pausedTopicPartitions = Collections.EMPTY_SET; + +try { +// pause other topic partitions to only poll from current topic partition +pausedTopicPartitions = pauseTopicPartitions(topicPartition); + +seek(topicPartition,
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user harshach commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r83285410 --- Diff: examples/storm-starter/src/jvm/org/apache/storm/starter/trident/DebugMemoryMapState.java --- @@ -0,0 +1,73 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.starter.trident; + +import org.apache.storm.task.IMetricsContext; +import org.apache.storm.topology.FailedException; +import org.apache.storm.trident.state.CombinerValueUpdater; +import org.apache.storm.trident.state.State; +import org.apache.storm.trident.state.StateFactory; +import org.apache.storm.trident.state.ValueUpdater; +import org.apache.storm.trident.testing.MemoryMapState; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.List; +import java.util.Map; +import java.util.UUID; + +public class DebugMemoryMapState extends MemoryMapState { +private static final Logger LOG = LoggerFactory.getLogger(DebugMemoryMapState.class); + +private int updateCount = 0; + +public DebugMemoryMapState(String id) { +super(id); +} + +public List multiUpdate(Listkeys, List updaters) { +print(keys, updaters); +if ((updateCount++ % 5) == 0) { +LOG.error("Throwing FailedException"); +throw new FailedException("Enforced State Update Fail. On retrial should replay the exact same batch."); +} +return super.multiUpdate(keys, updaters); +} + +private void print(List
keys, List updaters) { +for (int i = 0; i < keys.size(); i++) { +ValueUpdater valueUpdater = updaters.get(i); +Object arg = ((CombinerValueUpdater) valueUpdater).getArg(); +LOG.debug("updateCount = {}, keys = {} => updaterArgs = {}", updateCount, keys.get(i), arg); --- End diff -- Should this just log at info level? Since this is a debug state, why make users take the extra step of enabling debug logging for this topology?
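The failure-injection pattern under review can be sketched in isolation. This is only an illustration, not the code from the pull request: the class name `FailEveryFifthUpdate` and its `update` method are hypothetical, it uses `java.util.logging` instead of SLF4J so that it runs without extra dependencies, and it returns `false` where the real class throws `FailedException`. It logs at INFO, as the comment suggests, and shows which calls the post-increment check `(updateCount++ % 5) == 0` fails.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for the failure-injection idea in DebugMemoryMapState.
// Assumption: java.util.logging replaces SLF4J, and a boolean result replaces
// the FailedException thrown by the real class.
class FailEveryFifthUpdate {
    private static final Logger LOG = Logger.getLogger(FailEveryFifthUpdate.class.getName());

    private int updateCount = 0;

    /** Returns true if the update was applied, false if a failure was injected. */
    boolean update(String key, Object arg) {
        // Logged at INFO so the output is visible with default logging settings.
        LOG.info("updateCount = " + updateCount + ", key = " + key + " => updaterArg = " + arg);
        // Post-increment: the check sees 0, 1, 2, ... so calls 0, 5, 10, ... fail.
        if ((updateCount++ % 5) == 0) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        FailEveryFifthUpdate state = new FailEveryFifthUpdate();
        for (int i = 0; i < 7; i++) {
            System.out.println("call " + i + " applied = " + state.update("word", i));
        }
    }
}
```

Note that the very first call fails, because the counter starts at 0 and is incremented only after the comparison.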
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user harshach commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r83312172 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutBatchMetadata.java --- @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.kafka.spout.trident; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.clients.consumer.ConsumerRecords; +import org.apache.kafka.common.TopicPartition; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.Serializable; +import java.util.List; + +/** + * Wraps transaction batch information + */ +public class KafkaTridentSpoutBatchMetadataimplements Serializable { +private static final Logger LOG = LoggerFactory.getLogger(KafkaTridentSpoutBatchMetadata.class); + +private TopicPartition topicPartition; // topic partition of this batch +private long firstOffset; // first offset of this batch +private long lastOffset;// last offset of this batch + +public KafkaTridentSpoutBatchMetadata(TopicPartition topicPartition, long firstOffset, long lastOffset) { +this.topicPartition = topicPartition; +this.firstOffset = firstOffset; +this.lastOffset = lastOffset; +} + +public KafkaTridentSpoutBatchMetadata(TopicPartition topicPartition, ConsumerRecords consumerRecords, KafkaTridentSpoutBatchMetadata lastBatch) { +this.topicPartition = topicPartition; + +List > records = consumerRecords.records(topicPartition); + +if (records != null && !records.isEmpty()) { +firstOffset = records.get(0).offset(); +lastOffset = records.get(records.size() - 1).offset(); +} else { +if (lastBatch != null) { +firstOffset = lastBatch.firstOffset; +lastOffset = lastBatch.lastOffset; +} +} +LOG.debug("Created {}", this); --- End diff -- probably useful to log the first and last offset of the batch.
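To make the offset bookkeeping in this constructor concrete, here is a minimal sketch under stated assumptions. `BatchMetadataSketch` is a hypothetical, simplified stand-in for `KafkaTridentSpoutBatchMetadata`: the topic partition is a plain `String`, and the offsets are passed in as a list rather than read from `ConsumerRecords`. The `toString()` format is also an assumption; the reply elsewhere in this thread only states that the class overrides `toString()` to print the first and last offset, which is why logging `this` already covers the request.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for KafkaTridentSpoutBatchMetadata.
// Assumption: offsets are passed in directly instead of via ConsumerRecords.
class BatchMetadataSketch {
    private final String topicPartition; // topic partition of this batch
    private long firstOffset;            // first offset of this batch
    private long lastOffset;             // last offset of this batch

    BatchMetadataSketch(String topicPartition, List<Long> offsets, BatchMetadataSketch lastBatch) {
        this.topicPartition = topicPartition;
        if (offsets != null && !offsets.isEmpty()) {
            // Records were polled: the batch spans the first to the last record.
            firstOffset = offsets.get(0);
            lastOffset = offsets.get(offsets.size() - 1);
        } else if (lastBatch != null) {
            // Nothing polled: carry the previous batch's offsets forward.
            firstOffset = lastBatch.firstOffset;
            lastOffset = lastBatch.lastOffset;
        }
    }

    // Assumed format; any logging call that stringifies this object prints the offsets.
    @Override
    public String toString() {
        return "BatchMetadataSketch{topicPartition=" + topicPartition
                + ", firstOffset=" + firstOffset
                + ", lastOffset=" + lastOffset + "}";
    }

    public static void main(String[] args) {
        BatchMetadataSketch first =
                new BatchMetadataSketch("test-trident-0", Arrays.asList(5L, 6L, 7L), null);
        BatchMetadataSketch empty =
                new BatchMetadataSketch("test-trident-0", Collections.<Long>emptyList(), first);
        System.out.println("Created " + first);
        System.out.println("Created " + empty);
    }
}
```

When neither records nor a previous batch are available, both offsets keep the default value 0, which mirrors the behavior of the constructor in the diff.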
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user harshach commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r83289318 --- Diff: examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentKafkaClientWordCountNamedTopics.java --- @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.starter.trident; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.storm.kafka.spout.KafkaSpoutConfig; +import org.apache.storm.kafka.spout.KafkaSpoutRetryExponentialBackoff; +import org.apache.storm.kafka.spout.KafkaSpoutRetryService; +import org.apache.storm.kafka.spout.KafkaSpoutStreams; +import org.apache.storm.kafka.spout.KafkaSpoutStreamsNamedTopics; +import org.apache.storm.kafka.spout.KafkaSpoutTupleBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilderNamedTopics; +import org.apache.storm.kafka.spout.trident.KafkaTridentSpoutManager; +import org.apache.storm.kafka.spout.trident.KafkaTridentSpoutOpaque; +import org.apache.storm.trident.Stream; +import org.apache.storm.trident.TridentState; +import org.apache.storm.trident.TridentTopology; +import org.apache.storm.trident.operation.builtin.Count; +import org.apache.storm.trident.operation.builtin.Debug; +import org.apache.storm.trident.testing.Split; +import org.apache.storm.tuple.Fields; +import org.apache.storm.tuple.Values; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.TimeUnit; + +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.EARLIEST; + +public class TridentKafkaClientWordCountNamedTopics extends TridentKafkaWordCount { +public TridentKafkaClientWordCountNamedTopics(String zkUrl, String brokerUrl) { +super(zkUrl, brokerUrl); +} + +protected TridentState addTridentState(TridentTopology tridentTopology) { +final Stream spoutStream = tridentTopology.newStream("spout1", createOpaqueKafkaSpoutNew()).parallelismHint(1); + +return spoutStream.each(spoutStream.getOutputFields(), new Debug(true)) +.each(new Fields("str"), new Split(), new Fields("word")) +.groupBy(new Fields("word")) +.persistentAggregate(new DebugMemoryMapState.Factory(), new Count(), new Fields("count")); +} + 
+private KafkaTridentSpoutOpaquecreateOpaqueKafkaSpoutNew() { +return new KafkaTridentSpoutOpaque (getKafkaTridentManager()); +} + +private KafkaTridentSpoutManager getKafkaTridentManager() { +return new KafkaTridentSpoutManager<>(getKafkaSpoutConfig(getKafkaSpoutStreams())); +} + +private KafkaSpoutConfig getKafkaSpoutConfig(KafkaSpoutStreams kafkaSpoutStreams) { +return new KafkaSpoutConfig.Builder (getKafkaConsumerProps(), kafkaSpoutStreams, getTuplesBuilder(), getRetryService()) +.setOffsetCommitPeriodMs(10_000) +.setFirstPollOffsetStrategy(EARLIEST) +.setMaxUncommittedOffsets(250) +.build(); +} + +protected Map getKafkaConsumerProps() { +Map props = new HashMap<>(); +props.put(KafkaSpoutConfig.Consumer.BOOTSTRAP_SERVERS, "127.0.0.1:9092"); +props.put(KafkaSpoutConfig.Consumer.GROUP_ID, "kafkaSpoutTestGroup"); +props.put(KafkaSpoutConfig.Consumer.KEY_DESERIALIZER, "org.apache.kafka.common.serialization.StringDeserializer"); +props.put(KafkaSpoutConfig.Consumer.VALUE_DESERIALIZER, "org.apache.kafka.common.serialization.StringDeserializer"); +props.put("max.partition.fetch.bytes", 200); +return props; +} + +
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user harshach commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r83290282 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutEmitter.java --- @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.storm.kafka.spout.trident; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.clients.consumer.ConsumerRecords; +import org.apache.kafka.clients.consumer.KafkaConsumer; +import org.apache.kafka.clients.consumer.OffsetAndMetadata; +import org.apache.kafka.common.TopicPartition; +import org.apache.storm.kafka.spout.KafkaSpoutConfig; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.trident.operation.TridentCollector; +import org.apache.storm.trident.spout.IOpaquePartitionedTridentSpout; +import org.apache.storm.trident.topology.TransactionAttempt; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.EARLIEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.LATEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST; +import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_LATEST; + +public class KafkaTridentSpoutEmitterimplements IOpaquePartitionedTridentSpout.Emitter >, Serializable { +private static final Logger LOG = LoggerFactory.getLogger(KafkaTridentSpoutEmitter.class); + +// Kafka +private final KafkaConsumer
kafkaConsumer; + +// Bookkeeping +private final KafkaTridentSpoutManager kafkaManager; +// Declare some KafkaTridentSpoutManager references for convenience +private final KafkaSpoutTuplesBuilder tuplesBuilder; +private final long pollTimeoutMs; +private final KafkaSpoutConfig.FirstPollOffsetStrategy firstPollOffsetStrategy; + +public KafkaTridentSpoutEmitter(KafkaTridentSpoutManager kafkaManager) { +this.kafkaManager = kafkaManager; +this.kafkaManager.subscribeKafkaConsumer(); + +//must subscribeKafkaConsumer before this line +kafkaConsumer = kafkaManager.getKafkaConsumer(); + +tuplesBuilder = kafkaManager.getTuplesBuilder(); +final KafkaSpoutConfig kafkaSpoutConfig = kafkaManager.getKafkaSpoutConfig(); +pollTimeoutMs = kafkaSpoutConfig.getPollTimeoutMs(); +firstPollOffsetStrategy = kafkaSpoutConfig.getFirstPollOffsetStrategy(); +LOG.debug("Created {}", this); +} + +@Override +public KafkaTridentSpoutBatchMetadata emitPartitionBatch(TransactionAttempt tx, TridentCollector collector, +KafkaTridentSpoutTopicPartition partitionTs, KafkaTridentSpoutBatchMetadata lastBatch) { +LOG.debug("Emitting batch: [transaction = {}], [partition = {}], [collector = {}], [lastBatchMetadata = {}]", +tx, partitionTs, collector, lastBatch); + +final TopicPartition topicPartition = partitionTs.getTopicPartition(); +KafkaTridentSpoutBatchMetadata currentBatch = lastBatch; +Collection pausedTopicPartitions = Collections.EMPTY_SET; + +try { +// pause other topic partitions to only poll from current topic partition +pausedTopicPartitions = pauseTopicPartitions(topicPartition); + +seek(topicPartition,
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
Github user harshach commented on a diff in the pull request: https://github.com/apache/storm/pull/1687#discussion_r83280245 --- Diff: examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentKafkaClientWordCountWildcardTopics.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.storm.starter.trident; + +import org.apache.storm.kafka.spout.KafkaSpoutStream; +import org.apache.storm.kafka.spout.KafkaSpoutStreams; +import org.apache.storm.kafka.spout.KafkaSpoutStreamsWildcardTopics; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilder; +import org.apache.storm.kafka.spout.KafkaSpoutTuplesBuilderWildcardTopics; +import org.apache.storm.tuple.Fields; + +import java.util.regex.Pattern; + +public class TridentKafkaClientWordCountWildcardTopics extends TridentKafkaClientWordCountNamedTopics { +private static final String TOPIC_WILDCARD_PATTERN = "test-trident(-1)?"; + +public TridentKafkaClientWordCountWildcardTopics(String zkUrl, String brokerUrl) { +super(zkUrl, brokerUrl); +} + +public static void main(String[] args) throws Exception { +final String[] zkBrokerUrl = parseUrl(args); --- End diff -- Do we still need the ZooKeeper config? This topology is using the new KafkaSpout, right?
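As background for the question, the new Kafka consumer connects through `bootstrap.servers` and has no ZooKeeper connection setting of its own. The sketch below builds consumer properties from a broker URL alone. It is an illustration, not code from the pull request: the class `BrokerOnlyConsumerProps` and the method `consumerProps` are hypothetical names, and the group id and deserializer values are taken from the example topology quoted in this thread. Whether the parent example class still needs the ZooKeeper URL for other purposes is not something this sketch settles.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds new-consumer properties from a broker URL only.
// The property keys are standard Kafka consumer configuration names.
class BrokerOnlyConsumerProps {
    static Map<String, Object> consumerProps(String brokerUrl) {
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", brokerUrl);
        props.put("group.id", "kafkaSpoutTestGroup");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Only a broker URL is read from the command line; no ZooKeeper URL is required here.
        String brokerUrl = args.length > 0 ? args[0] : "127.0.0.1:9092";
        System.out.println(consumerProps(brokerUrl));
    }
}
```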
[GitHub] storm pull request #1687: Apache master storm 1694 top storm 2097
GitHub user hmcl opened a pull request:

https://github.com/apache/storm/pull/1687

Apache master storm 1694 top storm 2097

The Kafka Trident implementation is on top of the Trident logs improvement patch because they are related, and it makes it easier to merge the patch. There is already another PR for STORM-2097.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hmcl/storm-apache Apache_master_STORM-1694_top_STORM-2097

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/1687.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #1687

commit 71465dc1fe7bb21c43e4b57bac7010105facd947
Author: Hugo Louro
Date: 2016-06-21T16:28:09Z

STORM-2097: Improve logging in trident core and examples
- Improve logging in trident core, MasterBatchCoordinator, and examples
- Added DebugMemoryMapState and test main for new Kafka client API

commit a2d678d800daf24b87226593d731cc43d63caa72
Author: Hugo Louro
Date: 2016-06-21T16:35:16Z

STORM-1694: Kafka Spout Trident Implementation Using New Kafka Consumer API
- Kafka New Client - Opaque Transactional Trident Spout Implementation
- Implementation supporting multiple named topics and wildcard topics
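The wildcard-topic support mentioned in the second commit can be illustrated with the pattern string `test-trident(-1)?` quoted from the example topology in this thread. The sketch below is hypothetical (`WildcardTopicMatch` and `matchingTopics` are not part of the pull request) and assumes that a topic is selected only when the whole topic name matches the pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of wildcard topic selection.
// Assumption: a topic is subscribed only if its full name matches the pattern.
class WildcardTopicMatch {
    static final Pattern TOPIC_WILDCARD_PATTERN = Pattern.compile("test-trident(-1)?");

    static List<String> matchingTopics(List<String> topics) {
        List<String> matched = new ArrayList<>();
        for (String topic : topics) {
            // matches() requires the entire topic name to match, not just a prefix.
            if (TOPIC_WILDCARD_PATTERN.matcher(topic).matches()) {
                matched.add(topic);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> topics = Arrays.asList("test-trident", "test-trident-1", "test-trident-2", "other");
        System.out.println(matchingTopics(topics));
    }
}
```

Under that assumption the pattern selects exactly two topic names, `test-trident` and `test-trident-1`.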