Well, the problem is that the separate Kafka records that make up one full AD record may end up on different workers (imagine that line 1 is on partition 1 and line 2 is on partition 2, and different Storm spout workers handle those partitions). I'd recommend joining the AD records before putting them into Kafka.
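To illustrate what I mean by joining the records before they reach Kafka, here is a rough sketch rather than a definitive implementation. It collapses the lines of each multi-line record into one string, which could then be sent as the value of a single Kafka message. The class name `AdRecordJoiner`, the separator, and above all the rule for detecting the start of a record (a line that begins with a date like `05/17/2016`) are assumptions on my part, as are the sample lines in `main`; adjust them to whatever your actual AD logs look like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch only: groups the lines of multi-line records into single-line strings
// so that each record can be written to Kafka as one message.
final class AdRecordJoiner {
    // ASSUMPTION: a new record starts with a date such as "05/17/2016 ...".
    private static final Pattern RECORD_START =
            Pattern.compile("^\\d{2}/\\d{2}/\\d{4}\\s.*");

    // ASSUMPTION: hypothetical separator; choose one the parser can split on.
    static final String SEPARATOR = " | ";

    static List<String> join(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines between or inside records
            }
            // A record-start line closes the record collected so far.
            if (RECORD_START.matcher(line).matches() && current.length() > 0) {
                records.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(SEPARATOR);
            }
            current.append(line);
        }
        if (current.length() > 0) {
            records.add(current.toString()); // flush the last record
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample input, not real AD log data.
        List<String> lines = Arrays.asList(
                "05/17/2016 11:14:00 AM",
                "LogName=Security",
                "EventCode=4624",
                "05/17/2016 11:15:00 AM",
                "LogName=Security");
        for (String record : join(lines)) {
            System.out.println(record);
        }
    }
}
```

Because each joined string is a single message, it lands on exactly one partition and is handled by one spout worker, so the parser always sees the whole record.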

On Tue, May 17, 2016 at 11:40 AM, Kumar, Deeptaanshu <[email protected]> wrote:

> Hi Metron Team,
>
> The Active Directory records span multiple Kafka records. The Active
> Directory logs come in multi-line format directly from the servers. If I
> remove the newlines from the test data, and alter the parser to pass the
> integration tests, the parser will fail when it tries to parse actual
> Active Directory logs. I think we may need to slightly alter the Metron
> code that handles the integration tests to deal with multi-line records.
> Please let me know how you want me to handle this issue.
>
> Sincerely,
>
> Deeptaanshu Kumar
> EDS ISRM
> Data Engineer
> [email protected]
>
> On 5/17/16, 11:23 AM, "Casey Stella" <[email protected]> wrote:
>
> >Is a record spanning multiple kafka records (one record per line) or is it
> >just that your test data is multi-line? If it's the former, then I think
> >you may have a problem. If it's just the latter, could you just remove the
> >newlines from your test data?
> >
> >On Tue, May 17, 2016 at 11:14 AM, Kumar, Deeptaanshu <
> >[email protected]> wrote:
> >
> >> Hi Metron Team,
> >>
> >> I am working on the Active Directory parser, and I have a question about
> >> the integration tests. Active Directory logs are multi-line logs, and
> >> currently, the Metron integration tests are configured to handle
> >> single-line logs so the integration tests fail for Active Directory. How
> >> would you recommend that I proceed with the integration tests for Active
> >> Directory logs? Should I modify code in the ParserIntegrationTest.java
> >> file to accommodate for multi-line logs?
> >>
> >> Sincerely,
> >>
> >> *Deeptaanshu Kumar*
> >> *EDS ISRM*
> >> *Data Engineer*
> >> [email protected]
> >>
> >> ------------------------------
> >>
> >> The information contained in this e-mail is confidential and/or
> >> proprietary to Capital One and/or its affiliates and may only be used
> >> solely in performance of work or services for Capital One. The
> >> information transmitted herewith is intended only for use by the
> >> individual or entity to which it is addressed. If the reader of this
> >> message is not the intended recipient, you are hereby notified that any
> >> review, retransmission, dissemination, distribution, copying or other
> >> use of, or taking of any action in reliance upon this information is
> >> strictly prohibited. If you have received this communication in error,
> >> please contact the sender and delete the material from your computer.
