Github user harshach commented on a diff in the pull request:
https://github.com/apache/storm/pull/871#discussion_r44600086
--- Diff: external/storm-hive/src/main/java/org/apache/storm/hive/bolt/HiveBolt.java ---
@@ -134,22 +131,16 @@ public void execute(Tuple tuple) {
collector.ack(t);
tupleBatch.clear();
}
+ } catch(SerializationError se) {
+ LOG.info("Serialization exception occurred, tuple is
acknowledged but not written to Hive.", tuple);
+ collector.ack(tuple);
--- End diff ---
@revans2 A Hive SerializationError means the tuple-field to table-column
mapping is not right. For example, if the table has 6 columns and this tuple
has only 5 fields, it can throw SerializationError. In this case, instead of
failing the tuple and repeatedly re-running into the same error, it's better
to ack it and log an error.
hiveWriter.write will throw an error on the current tuple, and it will not be
added to the tupleBatch. Since this only happens for the current tuple, the
other tuples in the batch are good and we can still proceed with writing them
to Hive.
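To make the control flow concrete, here is a minimal, hypothetical sketch of the execute() path being discussed. The writeToHive/flush helpers, the batchSize field, and the org.apache.storm package names (Storm 1.0+ namespaces) are assumptions standing in for the real HiveBolt/HiveWriter plumbing, not the exact patch:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hive.hcatalog.streaming.SerializationError;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.tuple.Tuple;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HiveBoltSketch {
    private static final Logger LOG = LoggerFactory.getLogger(HiveBoltSketch.class);

    private OutputCollector collector;
    private final List<Tuple> tupleBatch = new ArrayList<>();
    private int batchSize = 100;

    public void execute(Tuple tuple) {
        try {
            // Throws SerializationError on a field/column mismatch, before
            // the tuple is ever added to the batch.
            writeToHive(tuple);
            tupleBatch.add(tuple);
            if (tupleBatch.size() >= batchSize) {
                flush(); // commit the transaction batch
                for (Tuple t : tupleBatch) {
                    collector.ack(t);
                }
                tupleBatch.clear();
            }
        } catch (SerializationError se) {
            // Deterministic mapping error: replaying would hit it again, so
            // ack and log instead of spinning on fail/replay. The rest of the
            // batch is unaffected because this tuple never joined it.
            LOG.info("Serialization exception occurred, tuple is acknowledged but not written to Hive.", se);
            collector.ack(tuple);
        } catch (Exception e) {
            // Anything else may be transient, so fail the tuple for replay.
            LOG.warn("Hive write failed, failing tuple for replay.", e);
            collector.fail(tuple);
        }
    }

    // Hypothetical stand-ins for the real HiveWriter calls.
    private void writeToHive(Tuple tuple) throws Exception { /* hiveWriter.write(...) */ }
    private void flush() throws Exception { /* flushAllWriters() */ }
}
```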