I definitely agree that the integration tests should handle multi-line
inputs, and binary inputs for that matter. For pcap, we did this by using
sequence files as the storage format, but there are many options.
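
As one illustration of those options, here is a minimal sketch (in Python
for brevity; this is not Metron code) of how a test harness could group
sample-data lines into multi-line records before writing them to Kafka, so
that each Kafka message carries one full record. The function name, the
date-based record-start pattern, and the sample lines are all assumptions
made up for illustration; the real AD log layout and the actual test
utilities may differ.

```python
import re
from typing import Callable, Iterable, List


def group_multiline_records(
    lines: Iterable[str],
    is_record_start: Callable[[str], bool],
) -> List[bytes]:
    """Join consecutive lines into records, returning one byte string per record."""
    records: List[List[str]] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if is_record_start(line) or not records:
            records.append([line])      # a new record begins here
        else:
            records[-1].append(line)    # continuation of the current record
    return ["\n".join(r).encode("utf-8") for r in records]


# Hypothetical delimiter: assume every record starts with a date such as 05/17/2016.
STARTS_WITH_DATE = re.compile(r"^\d{2}/\d{2}/\d{4} ")

# Made-up sample lines, not real Active Directory output.
sample = [
    "05/17/2016 11:14:00 AM",
    "LogName=Security",
    "EventCode=4624",
    "05/17/2016 11:15:00 AM",
    "LogName=Security",
    "EventCode=4634",
]

messages = group_multiline_records(
    sample, lambda l: bool(STARTS_WITH_DATE.match(l))
)
```

Each element of messages would then be sent as a single Kafka message,
which mirrors a setup where the whole multi-line log arrives as one record.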

On Tue, May 17, 2016 at 12:36 Kumar, Deeptaanshu <
[email protected]> wrote:

> Hi Metron Team,
>
> I misspoke earlier when I said the AD logs span multiple Kafka records. I
> meant to say that, the way the Metron integration tests are currently set
> up, each line in the AD log is being treated as a separate Kafka record. I
> took a look at the code again, and the readSampleData() method in
> TestUtils.java is reading each line in the AD log as a separate log. From
> there, the writeMessages() method in KafkaWithZKComponent.java is writing
> each line of the AD log to a different Kafka producer. If we could add
> code in either of these classes to handle multi-line logs, we would be
> able to fix this issue.
>
> I can join the AD records into a single line in my test logs; however, I
> would then need to change the AD parser to handle one-line AD logs. Once I
> do that, the parser will pass the integration tests but will fail in
> production, where the logs will be multi-line, not single-line. Jonathon
> Striley is correct: NiFi is configured to pass the entire multi-line AD
> log as one record to Kafka, which is why this parser is currently working
> in production.
>
> I just saw Ryan Merriman’s email, so should I continue this conversation
> with him outside of this dev list, or should I continue providing updates
> on this email thread?
>
> Sincerely,
>
> Deeptaanshu Kumar
> EDS - ISRM
> Data Engineer
> [email protected]
>
> On 5/17/16, 11:42 AM, "Casey Stella" <[email protected]> wrote:
>
> >Well, the problem is that the different Kafka records that make up the
> >full AD record may end up on different workers (imagine a situation where
> >line 1 is on partition 1 and line 2 is on partition 2, and different Storm
> >spout workers handle those partitions).  I'd recommend joining the AD
> >records prior to putting them into Kafka.
> >
> >On Tue, May 17, 2016 at 11:40 AM, Kumar, Deeptaanshu <
> >[email protected]> wrote:
> >
> >> Hi Metron Team,
> >>
> >> The Active Directory records span multiple Kafka records. The Active
> >> Directory logs come in multi-line format directly from the servers. If I
> >> remove the newlines from the test data, and alter the parser to pass the
> >> integration tests, the parser will fail when it tries to parse actual
> >> Active Directory logs. I think we may need to slightly alter the Metron
> >> code that handles the integration tests to deal with multi-line records.
> >> Please let me know how you want me to handle this issue.
> >>
> >> Sincerely,
> >>
> >> Deeptaanshu Kumar
> >> EDS - ISRM
> >> Data Engineer
> >> [email protected]
> >>
> >>
> >> On 5/17/16, 11:23 AM, "Casey Stella" <[email protected]> wrote:
> >>
> >> >Is a record spanning multiple Kafka records (one record per line), or
> >> >is it just that your test data is multi-line?  If it's the former, then
> >> >I think you may have a problem.  If it's just the latter, could you just
> >> >remove the newlines from your test data?
> >> >
> >> >On Tue, May 17, 2016 at 11:14 AM, Kumar, Deeptaanshu <
> >> >[email protected]> wrote:
> >> >
> >> >> Hi Metron Team,
> >> >>
> >> >> I am working on the Active Directory parser, and I have a question
> >> >> about the integration tests. Active Directory logs are multi-line
> >> >> logs, and currently, the Metron integration tests are configured to
> >> >> handle single-line logs, so the integration tests fail for Active
> >> >> Directory. How would you recommend that I proceed with the integration
> >> >> tests for Active Directory logs? Should I modify code in the
> >> >> ParserIntegrationTest.java file to accommodate multi-line logs?
> >> >>
> >> >> Sincerely,
> >> >>
> >> >> *Deeptaanshu Kumar*
> >> >> *EDS - ISRM*
> >> >> *Data Engineer*
> >> >> [email protected]
> >> >>
> >> >> ------------------------------
> >> >>
> >> >> The information contained in this e-mail is confidential and/or
> >> >> proprietary to Capital One and/or its affiliates and may only be used
> >> >> solely in performance of work or services for Capital One. The
> >> >> information transmitted herewith is intended only for use by the
> >> >> individual or entity to which it is addressed. If the reader of this
> >> >> message is not the intended recipient, you are hereby notified that any
> >> >> review, retransmission, dissemination, distribution, copying or other
> >> >> use of, or taking of any action in reliance upon this information is
> >> >> strictly prohibited. If you have received this communication in error,
> >> >> please contact the sender and delete the material from your computer.
> >> >>
> >>
>
