[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Mike Percy has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 7: Nice work Will! -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Mike Percy has submitted this change and it was merged. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Add AvroKuduEventProducer to Kudu-Flume integration This adds a KuduEventProducer that accepts events with Avro-encoded bodies and allows the KuduSink to push them to Kudu. It only accepts schemas of Record type that hold values of types compatible with Kudu's types. The schema is provided to the EventProducer ahead of time or in the event header, as a URL or as a literal. Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Reviewed-on: http://gerrit.cloudera.org:8080/4034 Tested-by: Kudu Jenkins Reviewed-by: Mike Percy --- M java/kudu-flume-sink/pom.xml A java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java A java/kudu-flume-sink/src/test/avro/testAvroEventProducer.avsc A java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java 4 files changed, 560 insertions(+), 0 deletions(-) Approvals: Mike Percy: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Mike Percy has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Kudu Jenkins has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 7: Build Started http://104.196.14.100/job/kudu-gerrit/3231/ -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Will Berkeley has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 6: (7 comments) http://gerrit.cloudera.org:8080/#/c/4034/6/java/kudu-flume-sink/pom.xml File java/kudu-flume-sink/pom.xml: Line 82: ${project.basedir}/src/main/avro/ > this should not be in src/main it should be in src/test since this is only Done Line 85: > nit: this line should be un-indented by 2 spaces Done http://gerrit.cloudera.org:8080/#/c/4034/6/java/kudu-flume-sink/src/test/avro/testAvroEventProducer.avsc File java/kudu-flume-sink/src/test/avro/testAvroEventProducer.avsc: Line 5: {"name": "key", "type": "int"}, > I don't think we should name these fields with names that match types exact Done http://gerrit.cloudera.org:8080/#/c/4034/6/java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java File java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java: Line 136: assertTrue("incorrect status for empty channel", status == Sink.Status.BACKOFF); > assertEquals("incorrect status for empty channel", status, Sink.Status.BACK Done Line 138: assertTrue("incorrect status for non-empty channel", status != Sink.Status.BACKOFF); > assertEquals("incorrect status for non-empty channel", status, Sink.Status. Done Line 148: ArrayList columns = new ArrayList<>(3); > nit: initial capacity should be 5 Done Line 179: record.setLong$(2L * i); > See my comments about this $ syntax in the .avsc file -- let's just rename Done -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4034 to look at the new patch set (#7). Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Add AvroKuduEventProducer to Kudu-Flume integration This adds a KuduEventProducer that accepts events with Avro-encoded bodies and allows the KuduSink to push them to Kudu. It only accepts schemas of Record type that hold values of types compatible with Kudu's types. The schema is provided to the EventProducer ahead of time or in the event header, as a URL or as a literal. Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 --- M java/kudu-flume-sink/pom.xml A java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java A java/kudu-flume-sink/src/test/avro/testAvroEventProducer.avsc A java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java 4 files changed, 560 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/4034/7 -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Mike Percy has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 6: (7 comments) Found a few more test nits. I think this is nearly ready to commit now. http://gerrit.cloudera.org:8080/#/c/4034/6/java/kudu-flume-sink/pom.xml File java/kudu-flume-sink/pom.xml: Line 82: ${project.basedir}/src/main/avro/ this should not be in src/main it should be in src/test since this is only used in the test module. In fact, that is the default for the generate-test-sources goal and you should be able to leave off the configuration section completely from here. I just tested this and it's true, you can remove this section. Line 85: nit: this line should be un-indented by 2 spaces http://gerrit.cloudera.org:8080/#/c/4034/6/java/kudu-flume-sink/src/test/avro/testAvroEventProducer.avsc File java/kudu-flume-sink/src/test/avro/testAvroEventProducer.avsc: Line 5: {"name": "key", "type": "int"}, I don't think we should name these fields with names that match types exactly. It makes the generated Avro method names a bit weird, and people reading the tests might be a little confused. Instread of "long", how about "longField" or something? Instead of "null", how about "nullableStringField"? Suggestions: * key (this is alright, or intKey) * longField * doubleField * nullableStringField * stringField http://gerrit.cloudera.org:8080/#/c/4034/6/java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java File java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java: Line 136: assertTrue("incorrect status for empty channel", status == Sink.Status.BACKOFF); assertEquals("incorrect status for empty channel", status, Sink.Status.BACKOFF); Line 138: assertTrue("incorrect status for non-empty channel", status != Sink.Status.BACKOFF); assertEquals("incorrect status for non-empty channel", status, Sink.Status.READY); Line 148: ArrayList columns = new ArrayList<>(3); nit: initial capacity should be 5 Line 179: record.setLong$(2L * i); See my comments about this $ syntax in the .avsc file -- let's just rename these fields not to conflict with type names. -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Will Berkeley has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 5: > BTW this producer is starting to look great. :D -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Kudu Jenkins has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 6: Build Started http://104.196.14.100/job/kudu-gerrit/3230/ -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Will Berkeley has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/4034/5/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java: Line 57: * > nit: i don't these extra s do anything. Do they? Done Line 259: private Schema schema(Event event) throws FlumeException { > mind naming this getSchema() for consistency with Java style? Done http://gerrit.cloudera.org:8080/#/c/4034/5/java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java File java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java: Line 61: private static final String schemaLiteral = > you can use Guava and do: Done Line 111: private void testEvents(int eventCount, String whereIsSchema) > style nit: let's create a SchemaLocation enum for whereIsSchema instead of Done Line 118: : new Context(ImmutableMap.of("producer.schemaPath", schemaURI)); > style nit: Use constants for config values: Done Line 175: GenericRecord record = new GenericData.Record(schema); > Would you mind using SpecificRecord instead of GenericRecord when writing t I think I did this right. -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4034 to look at the new patch set (#6). Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Add AvroKuduEventProducer to Kudu-Flume integration This adds a KuduEventProducer that accepts events with Avro-encoded bodies and allows the KuduSink to push them to Kudu. It only accepts schemas of Record type that hold values of types compatible with Kudu's types. The schema is provided to the EventProducer ahead of time or in the event header, as a URL or as a literal. Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 --- M java/kudu-flume-sink/pom.xml A java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java A java/kudu-flume-sink/src/test/avro/testAvroEventProducer.avsc A java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java 4 files changed, 563 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/4034/6 -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Mike Percy has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 5: BTW this producer is starting to look great. -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Mike Percy has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/4034/5/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java: Line 57: * nit: i don't these extra s do anything. Do they? Line 259: private Schema schema(Event event) throws FlumeException { mind naming this getSchema() for consistency with Java style? http://gerrit.cloudera.org:8080/#/c/4034/5/java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java File java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java: Line 61: private static final String schemaLiteral = you can use Guava and do: schemaLiteral = Files.toString(new File(schemaPath), Charsets.UTF_8); Line 111: private void testEvents(int eventCount, String whereIsSchema) style nit: let's create a SchemaLocation enum for whereIsSchema instead of using a String Line 118: : new Context(ImmutableMap.of("producer.schemaPath", schemaURI)); style nit: Use constants for config values: import static KuduSinkConfigurationConstants.PRODUCER_PREFIX; import static AvroKuduEventProducer.SCHEMA_PROP; : new Context(ImmutableMap.of(PRODUCER_PREFIX + SCHEMA_PROP, schemaURI)); Line 175: GenericRecord record = new GenericData.Record(schema); Would you mind using SpecificRecord instead of GenericRecord when writing the events? That way we can be sure writing via SpecificRecord works fine when decoding with GenericRecord with Kudu. You can do it like this: 1. Move the avro schema to src/test/avro/ 2. Make the following changes to the pom: @@ -23,6 +23,7 @@ +1.8.1 1.6.0 2.7.0 @@ -67,6 +68,19 @@ + +org.apache.avro +avro-maven-plugin +${avro.version} + + +generate-test-sources + + schema + + + + @@ -92,6 +106,13 @@ + org.apache.avro + avro + ${avro.version} + provided + + + org.apache.hadoop hadoop-client ${hadoop.version} Then Maven should generate the AvroKuduEventProducerTestRecord class for you. I sort-of tested the codegen and it seems to work. -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Will Berkeley has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 5: > Build Failed > > http://104.196.14.100/job/kudu-gerrit/3158/ : FAILURE All the KuduSink tests passed. I think this one is on Jenkins. -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4034 to look at the new patch set (#5). Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Add AvroKuduEventProducer to Kudu-Flume integration This adds a KuduEventProducer that accepts events with Avro-encoded bodies and allows the KuduSink to push them to Kudu. It only accepts schemas of Record type that hold values of types compatible with Kudu's types. The schema is provided to the EventProducer ahead of time or in the event header, as a URL or as a literal. Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 --- M java/kudu-flume-sink/pom.xml A java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java A java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java A java/kudu-flume-sink/src/test/resources/testAvroEventProducer.avsc 4 files changed, 538 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/4034/5 -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Kudu Jenkins has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 5: Build Started http://104.196.14.100/job/kudu-gerrit/3158/ -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Will Berkeley has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 4: (7 comments) Did a significant rewrite to make it more similar to the DataSet Sink implementation. Sorry if this pushes it further from the finish line but I think it will make it better in the end. It'll get another rewrite when the KuduEventProducer API changes, too. http://gerrit.cloudera.org:8080/#/c/4034/4/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java: Line 60: * producer.schema.path > should be schemaPath Done Line 63: * The location of the Avro schema file used to deserialize the Avro-encoded event bodies. > Add: If not specified, the schema must be specified on a per-event basis. Done Line 73: private static final String SCHEMA_HEADER = "schemaPath"; > Let's use the same property as the Kite sink for the event headers so that Done Line 110: String.format("No schema for event! Specify either property %s or event header %s", > s/property/configuration property/ Done Line 121: payloadReader = new DataFileReader<>(new SeekableByteArrayInput(payload), reader); > We should not treat each Flume Event as an Avro DataFile. We should be trea Did a significant rewrite based on AvroParser. Line 150: if (!col.isNullable()) { > I think this should be: Done. Made a note to myself fix later. Line 202: private DatumReader openSchema(String schemaPath) { > It would be nice to support both URL and literal. See how the DatasetSink i Done -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Mike Percy has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 4: (7 comments) http://gerrit.cloudera.org:8080/#/c/4034/4/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java: Line 60: * producer.schema.path should be schemaPath Line 63: * The location of the Avro schema file used to deserialize the Avro-encoded event bodies. Add: If not specified, the schema must be specified on a per-event basis. Line 73: private static final String SCHEMA_HEADER = "schemaPath"; Let's use the same property as the Kite sink for the event headers so that both Flume sinks can operate on the same events and provide the same behavior. That means we should use "flume.avro.schema.url" for the header property. Line 110: String.format("No schema for event! Specify either property %s or event header %s", s/property/configuration property/ Line 121: payloadReader = new DataFileReader<>(new SeekableByteArrayInput(payload), reader); We should not treat each Flume Event as an Avro DataFile. We should be treating each Flume Event as an Avro record. I recommend looking at the Flume DatasetSink implementation and doing the same thing. Here is how they parse the events: https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/AvroParser.java#L152 Line 150: if (!col.isNullable()) { I think this should be: if (value == null) { if (col.isNullable()) { row.setNull(name); } else { // leave unset for possible Kudu default } ...actually, it seems like we are missing something from the client API like ColumnSchema.hasDefaultValue() because if we could check that then we could throw if the field is null and a default does not exist. We should probably fix that in a follow-up. Line 202: private DatumReader openSchema(String schemaPath) { It would be nice to support both URL and literal. See how the DatasetSink implements this here: https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-dataset-sink/src/main/java/org/apache/flume/sink/kite/parser/AvroParser.java#L175 -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Will Berkeley has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/pom.xml File java/kudu-flume-sink/pom.xml: Line 74: hadoop-client > Yeah, I would also like to see this dep shaded. Done http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java: PS1, Line 52: producer.schema > Flume sinks typically use camelCaps like schemaPath for configuration optio Done Line 89: public void configure(ComponentConfiguration conf) { > Hrm. You sure you had to override this? I had to. But now I notice that this method, which is required because KuduEventProducer implements ConfigurableComponent, is empty-bodied for every KuduEventProducer implementation. Maybe we should take it out? Or implement it for real for the KuduEventProducer implementations? Something for the follow-up either way. Line 93: public void initialize(Event event, KuduTable table) { > I don't think there are many users yet and I'd be OK with changing it pre-1 Will improve API and names in a follow-up. Line 133: row.addBoolean(name, (boolean) value); > If we use a single schema for the sink then we could probably do the valida Will do it like this: can optionally specify a global schema, validated up-front against the table (validation will have to wait for the reorg in follow-up patch), or a schema in the event header, which will override any global schema. http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java File java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java: Line 149: parameters.put(KuduSinkConfigurationConstants.PRODUCER, "org.apache.kudu.flume.sink.AvroKuduEventProducer"); > nit: long line Done -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Kudu Jenkins has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 4: Build Started http://104.196.14.100/job/kudu-gerrit/3089/ -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4034 to look at the new patch set (#4). Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Add AvroKuduEventProducer to Kudu-Flume integration This adds a KuduEventProducer that accepts events with Avro-encoded bodies and allows the KuduSink to push them to Kudu. It only accepts schemas of Record type that hold values of types compatible with Kudu's types. The schema is provided to the EventProducer ahead of time or in the event header, as a path. Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 --- M java/kudu-flume-sink/pom.xml A java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java A java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java A java/kudu-flume-sink/src/test/resources/testAvroEventProducer.avsc 4 files changed, 467 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/4034/4 -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4034 to look at the new patch set (#3). Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Add AvroKuduEventProducer to Kudu-Flume integration This adds a KuduEventProducer that accepts events with Avro-encoded bodies and allows the KuduSink to push them to Kudu. It only accepts schemas of Record type that hold values of types compatible with Kudu's types. The schema is provided to the EventProducer ahead of time or in the event header, as a path. Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 --- M java/kudu-flume-sink/pom.xml A java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java A java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java A java/kudu-flume-sink/src/test/resources/testAvroEventProducer.avsc A java/kudu-flume-sink/src/test/resources/testCustomSchemaAcroKuduEventProducer.avsc 5 files changed, 477 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/4034/3 -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Kudu Jenkins has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 3: Build Started http://104.196.14.100/job/kudu-gerrit/3073/ -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Mike Percy has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 1: (6 comments) This is good stuff. http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/pom.xml File java/kudu-flume-sink/pom.xml: Line 74: hadoop-client > To be honest, I wasn't sure about this, so I copied off of the map/reduce i Yeah, I would also like to see this dep shaded. http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java: PS1, Line 52: producer.schema > Done Flume sinks typically use camelCaps like schemaPath for configuration options. Line 89: public void configure(ComponentConfiguration conf) { Hrm. You sure you had to override this? Line 93: public void initialize(Event event, KuduTable table) { > Yup, it's weird. I don't think there are many users yet and I'd be OK with changing it pre-1.0. We can change it in a follow-up patch though. What I think would be better would be KuduEventProducer.initialize(KuduTable table) called once per KuduSink.start() -- after configure() is called. Then change KuduEventProducer.getOperations() to take Event as a parameter. To be honest, I think KuduEventProducer is also a bad name. This thing reads Flume Events and produces Kudu Operations. Maybe a better name would be KuduOperationsProducer. Line 133: row.addBoolean(name, (boolean) value); > We could improve the API, but it would mean changing KuduSink and the KuduE If we use a single schema for the sink then we could probably do the validation elsewhere. However I think it would be useful to support per-Event schema URLs as well, or maybe instead of this. Here is what the Kite sink does. They allow putting an avro schema URL in each event using the header key "flume.avro.schema.url". See https://flume.apache.org/FlumeUserGuide.html#kite-dataset-sink They also allow an Avro JSON literal in each Event as "flume.avro.schema.literal" but I'm not sure how popular that is since it takes up so much space. http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java File java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java: Line 149: parameters.put(KuduSinkConfigurationConstants.PRODUCER, "org.apache.kudu.flume.sink.AvroKuduEventProducer"); nit: long line -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Will Berkeley has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/pom.xml File java/kudu-flume-sink/pom.xml: Line 74: hadoop-client > is this typical for Flume sinks? does this get shaded? Or would we expect i To be honest, I wasn't sure about this, so I copied off of the map/reduce integration's pom. I think we do expect it to be on the classpath. The dependency is only needed because I'd like users to be able to load the schema from HDFS, not just the local filesystem. http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java: Line 51: * producer.upsertfalseNoWhether to upsert or insert rows into Kudu. > - can you wrap these to multiple lines for better readability? Done PS1, Line 52: producer.schema > this sounds like it would be a schema rather than a path. perhaps schema_pa Done Line 80: FileSystem fs = FileSystem.get(path.toUri(), new Configuration()); > I think this should be path.getFileSystem(new Configuration()) Done Line 93: public void initialize(Event event, KuduTable table) { > unrelated suggestion: this is a very strange method name to me, since it ge Yup, it's weird. Line 126: String.format("Column %s is not nullable but was decoded as null", name)); > does this happen if you have a missing column? if so, I think a fallback to Done Line 133: row.addBoolean(name, (boolean) value); > all of these could generate ClassCastExceptions, right? Do we want to do an We could improve the API, but it would mean changing KuduSink and the KuduEventProducer interface, and therefore the other event producers. I don't think the user-facing configuration would need to change. Assuming I'm right, separate patch or broaden the scope of this one? -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Kudu Jenkins has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 2: Build Started http://104.196.14.100/job/kudu-gerrit/2982/ -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4034 to look at the new patch set (#2). Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Add AvroKuduEventProducer to Kudu-Flume integration This adds a KuduEventProducer that accepts events with Avro-encoded bodies and allows the KuduSink to push them to Kudu. It only accepts schemas of Record type that hold values of types compatible with Kudu's types. The schema is provided to the EventProducer ahead of time, as a file. Hopefully, with the regex EventProducer, this will provide users with a basic capability to stream into Kudu using Flume via configuration, no coding. Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 --- M java/kudu-flume-sink/pom.xml A java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java A java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java A java/kudu-flume-sink/src/test/resources/testAvroEventProducer.avsc 4 files changed, 409 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/4034/2 -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Todd Lipcon has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/pom.xml File java/kudu-flume-sink/pom.xml: Line 74: hadoop-client is this typical for Flume sinks? does this get shaded? Or would we expect it to be on the classpath at runtime? Mike? http://gerrit.cloudera.org:8080/#/c/4034/1/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java: Line 51: * producer.upsertfalseNoWhether to upsert or insert rows into Kudu. - can you wrap these to multiple lines for better readability? - an enum like 'producer.operation' with 'insert' or 'upsert' would be more extensible (we could add insert-ignore later, for example) PS1, Line 52: producer.schema this sounds like it would be a schema rather than a path. perhaps schema_path? or schema.path? not sure if there si a convention to follow for flume sinks Line 80: FileSystem fs = FileSystem.get(path.toUri(), new Configuration()); I think this should be path.getFileSystem(new Configuration()) Line 93: public void initialize(Event event, KuduTable table) { unrelated suggestion: this is a very strange method name to me, since it gets called once per event. Maybe it's too late to change it, but maybe not? If we can't change it I think we should clarify the javadoc in the interface that this is called once per event, not once per producer. Line 126: String.format("Column %s is not nullable but was decoded as null", name)); does this happen if you have a missing column? if so, I think a fallback to _unset_ in kudu is preferable, so the kudu default takes precedence? I can't recall if Avro has explicit nulls different from 'missing' Line 133: row.addBoolean(name, (boolean) value); all of these could generate ClassCastExceptions, right? Do we want to do any up-front verification of the avro schema matching the target table schema? It's a little strange that you get the KuduTable object each time in initialize() instead of in configure(), which makes this tough to do. Can we improve the API or is it too late? -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Kudu Jenkins has posted comments on this change. Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/2972/ -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] Add AvroKuduEventProducer to Kudu-Flume integration
Will Berkeley has uploaded a new change for review. http://gerrit.cloudera.org:8080/4034 Change subject: Add AvroKuduEventProducer to Kudu-Flume integration .. Add AvroKuduEventProducer to Kudu-Flume integration This adds a KuduEventProducer that accepts events with Avro-encoded bodies and allows the KuduSink to push them to Kudu. It only accepts schemas of Record type that hold values of types compatible with Kudu's types. The schema is provided to the EventProducer ahead of time, as a file. Hopefully, with the regex EventProducer, this will provide users with a basic capability to stream into Kudu using Flume via configuration, no coding. Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 --- M java/kudu-flume-sink/pom.xml A java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduEventProducer.java A java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduEventProducerTest.java A java/kudu-flume-sink/src/test/resources/testAvroEventProducer.avsc 4 files changed, 383 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/4034/1 -- To view, visit http://gerrit.cloudera.org:8080/4034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6715df72e447e72f4801a2e026f6840d09b401e1 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley