Mike Percy has posted comments on this change.

Change subject: - moved the kudu slume sink code from the flume repo to the 
kudu repo - applied Mike Percy's notes
......................................................................


Patch Set 1:

(6 comments)

Thanks for posting this @arae

http://gerrit.cloudera.org:8080/#/c/2383/1//COMMIT_MSG
Commit Message:

Line 7: - moved the kudu slume sink code from the flume repo to the kudu repo
> nit: can you format the commit message with a one-line summary, followed by
agree with this, also might want to mention KUDU-431


http://gerrit.cloudera.org:8080/#/c/2383/1/java/kudu-flume-sink/pom.xml
File java/kudu-flume-sink/pom.xml:

Line 45:       <artifactId>flume-ng-core</artifactId>
> hm, I'm surprised that we have a dependency on 'core' and not just the 'sdk
Yeah the SDK is for remote stuff and core is for in-process plugins. I don't 
think we need to include the SDK here, actually.


http://gerrit.cloudera.org:8080/#/c/2383/1/java/kudu-flume-sink/src/main/java/org/kududb/flume/sink/KuduEventSerializer.java
File 
java/kudu-flume-sink/src/main/java/org/kududb/flume/sink/KuduEventSerializer.java:

Line 31:  * of an event to write them to Kudu. This is configurable, so any 
config
> "serializer" is a bit of a strange term for this... we're not actually seri
serializer makes more sense in the context of HBase or HDFS. Here, I agree, we 
are transforming, not really serializing. KuduEventTransformer sounds better to 
me.


http://gerrit.cloudera.org:8080/#/c/2383/1/java/kudu-flume-sink/src/main/java/org/kududb/flume/sink/KuduSink.java
File java/kudu-flume-sink/src/main/java/org/kududb/flume/sink/KuduSink.java:

Line 84:     Preconditions.checkArgument(table == null || client == null || 
session == null, "Please call stop " +
> should be checkState
As mentioned in the other review, I think you meant && not || here


Line 193: incrementBatchCompleteCount
BatchCompleteCount should only be incremented in cases where a batch (a 
transaction) was completed that had a full number of events in it. e.g. when i 
== batchSize then you increment this counter once.


Line 205:         session.flush();
Have you considered specialized error handling here for when a duplicate event 
comes? You may want an option of dropping the event if an insert fails because 
of an 'Already Exists' error. There may be cases where Kudu is crashing or 
shutting down while applying something, and will time out but will have applied 
the insert. In that case, Flume will retry later with the same insert and be 
unable to make progress.

Kudu doesn't yet have an "INSERT OR OVERWRITE" operation.


-- 
To view, visit http://gerrit.cloudera.org:8080/2383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia9c9897a0645e0bf43606e3bd07d648eeafd4224
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ara Ebrahimi <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to