Github user harshach commented on a diff in the pull request:

    https://github.com/apache/storm/pull/871#discussion_r44600086
  
    --- Diff: external/storm-hive/src/main/java/org/apache/storm/hive/bolt/HiveBolt.java ---
    @@ -134,22 +131,16 @@ public void execute(Tuple tuple) {
                         collector.ack(t);
                     tupleBatch.clear();
                 }
    +        } catch(SerializationError se) {
    +            LOG.info("Serialization exception occurred, tuple is acknowledged but not written to Hive.", tuple);
    +            collector.ack(tuple);
    --- End diff ---
    
    @revans2 A Hive SerializationError means the tuple-field to table-column mapping is not right. For example, if the table has 6 columns and the tuple has only 5 fields, Hive can throw a SerializationError. In that case, instead of failing the tuple and repeatedly re-running into the same error, it is better to ack the tuple and log an error.
    hiveWriter.write will throw the error on the current tuple, and that tuple will not be added to the tupleBatch. Since this only affects the current tuple, the other tuples in the batch are still good and we can proceed with writing them to Hive.
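    
    For anyone following along, here is a minimal, self-contained sketch of the control flow being proposed. This is not the actual patch: HiveBoltSketch, writeTuple(), flushAllWriters(), and batchSize are hypothetical stand-ins for HiveBolt's real fields and its hiveWriter.write(...) call, and the imports assume the org.apache.storm package layout (older Storm releases use backtype.storm).
    
        import java.util.ArrayList;
        import java.util.List;
        
        import org.apache.hive.hcatalog.streaming.SerializationError;
        import org.apache.storm.task.OutputCollector;
        import org.apache.storm.tuple.Tuple;
        import org.slf4j.Logger;
        import org.slf4j.LoggerFactory;
        
        public class HiveBoltSketch {
            private static final Logger LOG = LoggerFactory.getLogger(HiveBoltSketch.class);
        
            private final OutputCollector collector;
            private final List<Tuple> tupleBatch = new ArrayList<>();
            private final int batchSize;
        
            public HiveBoltSketch(OutputCollector collector, int batchSize) {
                this.collector = collector;
                this.batchSize = batchSize;
            }
        
            public void execute(Tuple tuple) {
                try {
                    // Map the tuple's fields onto the table's columns and buffer it.
                    // A field-to-column mismatch surfaces here as a SerializationError,
                    // before the tuple is ever added to tupleBatch.
                    writeTuple(tuple);
                    tupleBatch.add(tuple);
        
                    if (tupleBatch.size() >= batchSize) {
                        flushAllWriters();            // commit the batch to Hive
                        for (Tuple t : tupleBatch) {
                            collector.ack(t);         // ack only after a successful flush
                        }
                        tupleBatch.clear();
                    }
                } catch (SerializationError se) {
                    // Not transient: replaying this tuple would hit the same mapping
                    // error forever, so ack it and log instead of failing it.
                    LOG.info("Serialization exception occurred, tuple is acknowledged but not written to Hive. {}", tuple, se);
                    collector.ack(tuple);
                } catch (Exception e) {
                    // Possibly transient (e.g. a dropped connection): fail the
                    // current tuple and the whole in-flight batch so Storm replays them.
                    collector.reportError(e);
                    collector.fail(tuple);
                    for (Tuple t : tupleBatch) {
                        collector.fail(t);
                    }
                    tupleBatch.clear();
                }
            }
        
            // Hypothetical stand-ins for hiveWriter.write(...) and the flush logic.
            private void writeTuple(Tuple tuple) throws Exception { /* ... */ }
            private void flushAllWriters() throws Exception { /* ... */ }
        }
    
    The key point is the asymmetry between the two catch blocks: a SerializationError is acked and logged because replaying the tuple can never succeed, while any other exception fails the whole in-flight batch so Storm can replay it.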

