Re: Morphlines-solr-sink

2022-01-15 Thread Hari Shreedharan
+1 to removing it. Since the underlying project is no longer maintained, I
think we should be ok removing it.

On Sat, Jan 15, 2022 at 12:18 PM Ralph Goers 
wrote:

> I would like to see the data on the usage. I’m not sure how you would know
> since Cloudera doesn’t seem to include Flume in its products any more from
> what I can tell.
>
> The kite-morphlines project consists of 18 sub-modules plus 4 aggregation
> modules. That is a heck of a lot of stuff to try to drag in. I would prefer
> to fork the parts of kite we would need to a new flume-kite repo.
>
> It seems that the CVE the reporter mentioned does have a fix. It is
> available in parquet-avro 1.11.2 and 1.12.2.  I was able to swap the new
> version for the old one even though the groupId has changed. That said, the
> kite-sdk dependency that includes it is marked as optional, so parquet-avro
> would be optional as well. So I have no idea if it is even used.
>
> In any case, the unit tests all pass with the updated dependency.
>
> Ralph
>
>
>
> > On Jan 14, 2022, at 3:33 PM, Tristan Stevens  wrote:
> >
> > -1 from me.
> >
> > First, we can’t do that in a patch release, but that’s semantics.
> >
> > Both the Morphlines interceptor and the Morphlines-Solr-Sink are
> > components that are widely used amongst the community. I did some analysis
> > last year that I’ll dig out and share, but they are two of the most used
> > components after HDFS sink, Kafka and JMS.
> >
> > Whilst I agree it’s sucky that Cloudera aren’t supporting Kite anymore,
> > I wonder whether we can find a way to bring Morphlines into here, or
> > otherwise get upstream and fix the bits that need fixing.
> >
> > Tristan
> >
> >
> > From: Ralph Goers <ralph.go...@dslextreme.com>
> > Reply: dev@flume.apache.org <dev@flume.apache.org>
> > Date: 13 January 2022 at 15:26:12
> > To: dev@flume.apache.org <dev@flume.apache.org>
> > Subject:  Morphlines-solr-sink
> >
> >> While I am not having any trouble building the morphline-solr-sink
> >> component, it is dependent on the abandoned kite-sdk, which makes its life
> >> very limited.
> >>
> >> In addition, the kite-sdk has a dependency on parquet-avro which,
> >> according to https://issues.apache.org/jira/browse/FLUME-3403, has
> >> vulnerabilities in every available release.
> >>
> >> Due to these factors I am going to remove the morphline-solr-sink
> >> module from Flume for the 1.10.0 release.
> >>
> >> Ralph
>
>


Re: Breaking changes in Flume - 2.0 release

2018-01-30 Thread Hari Shreedharan
Agreed. Another change we might want to consider is shading the
dependencies of individual modules and the framework itself. This will make
it easier to upgrade individual modules.

On Mon, Jan 29, 2018 at 2:58 PM, Mike Percy  wrote:

> I think this change is important enough to warrant a breaking version bump.
> Virtually every project depends on Guava, and Flume should certainly
> shade/rename Guava for its internal use.
>
> I'd be on board with doing a Flume 2.0 to remove Guava from the public API.
>
> Mike
>
> On Mon, Jan 29, 2018 at 2:31 AM, Denes Arvay  wrote:
>
> > Hi Flume Community,
> >
> > It has come up a couple of times that we should fully remove (or at least
> > shade) Guava from Flume, but unfortunately it's part of our public API
> > [1,2,3].
> > As a first step I worked on FLUME-2957 [4] and created a pull request [5].
> > Mike has already reviewed it, thanks for that. As he pointed out, it is a
> > breaking change, so it can only be included in the Flume 2.0 release.
> >
> > Flume 2.0 will be a good opportunity to introduce other breaking changes as
> > well, so we might want to start collecting the features, improvements, and
> > other possible changes.
> >
> > What do you think?
> >
> > Best,
> > Denes
> >
> > [1]
> > https://github.com/apache/flume/blob/trunk/flume-ng-configuration/src/main/java/org/apache/flume/Context.java#L51
> > [2]
> > https://github.com/apache/flume/blob/trunk/flume-ng-node/src/main/java/org/apache/flume/node/MaterializedConfiguration.java#L41
> > [3]
> > https://github.com/apache/flume/blob/trunk/flume-ng-node/src/main/java/org/apache/flume/node/SimpleMaterializedConfiguration.java#L64
> > [4] https://issues.apache.org/jira/browse/FLUME-2957
> > [5] https://github.com/apache/flume/pull/195
> >
>
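As a rough illustration of what removing Guava from a public signature could look like, here is a minimal sketch that uses only JDK types. `SketchContext` is a hypothetical stand-in, not Flume's `Context` class, and it assumes the public method in question currently returns a Guava collection type; the sketch returns an unmodifiable `java.util.Map` instead.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration holder; NOT Flume's actual Context class.
class SketchContext {
    private final Map<String, String> parameters = new HashMap<>();

    void put(String key, String value) {
        parameters.put(key, value);
    }

    // Returns a read-only snapshot built from JDK types only,
    // so no third-party collection type leaks into the public API.
    Map<String, String> getParameters() {
        return Collections.unmodifiableMap(new HashMap<>(parameters));
    }

    public static void main(String[] args) {
        SketchContext ctx = new SketchContext();
        ctx.put("capacity", "1000");

        Map<String, String> view = ctx.getParameters();
        System.out.println(view.get("capacity")); // prints "1000"

        try {
            view.put("x", "y"); // callers cannot mutate the snapshot
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only"); // prints "read-only"
        }
    }
}
```

Changing a return type like this is itself source- and binary-incompatible for existing callers, which is consistent with the thread treating it as a 2.0-only change.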


[ANNOUNCE] New Flume PMC Chair

2018-01-18 Thread Hari Shreedharan
Hi all,

It gives me immense happiness to announce that the Apache Software
Foundation Board has appointed Mike Percy as the new PMC chair of the
Apache Flume Project. Mike has contributed immensely to the project, and is
one of the most active contributors to the project.

I am confident that Mike will do an amazing job as the chair of the PMC.
Please join me in congratulating Mike and welcoming him to this new role!

Thanks,
Hari


Re: Squash commits on trunk

2018-01-09 Thread Hari Shreedharan
I don't have any objections to that, but I have to wonder if it makes sense
to update the guidelines to actually not have to squash commits. I think
the reason we needed to squash those commits was that we were originally on
SVN and having multiple commits didn't make much sense in SVN. It is easy
to track history with a single commit, but that looks to be the case anyway
(I just see 1 merge commit, which is fine - it is an artifact of pull
request merges).

That said, I don't have an objection to force-pushing, we just need to make
sure no history is lost.

On Tue, Jan 9, 2018 at 1:03 AM, Denes Arvay  wrote:

> Hi Flume Community,
>
> A couple of commits went in to trunk recently which weren't in line with
> our commit guidelines.
> I suggest to squash these commits to one and do a force push to resolve
> this issue, plus - as the guidelines are not clear enough - I'd like to
> extend the
> https://github.com/apache/flume/blob/trunk/dev-docs/HowToCommit.md doc to
> be more concrete on the requirements for a commit. These rules are
> currently mostly unwritten, so it'd be useful to clarify them.
>
> I'm happy to do these if there is no objection from the community.
>
> Regards,
> Denes
>


Re: Message Lists

2017-12-11 Thread Hari Shreedharan
+1. I agree, we should move the private messages out.

On Sun, Dec 10, 2017 at 12:14 PM, Mike Percy  wrote:

> If necessary due to noise we can take this discussion back to private@
> for a check-in.
>
> Mike
>
> Sent from my iPhone
>
> > On Dec 10, 2017, at 9:15 AM, Ralph Goers 
> wrote:
> >
> > Thanks Mike. Yours is the only feedback in a month. I am uncomfortable
> contacting infra to make the changes based on so little input.
> >
> > Ralph
> >
> >> On Dec 8, 2017, at 9:03 PM, Mike Percy  wrote:
> >>
> >> Sorry, I didn't see this message because of all the automated emails!
> >>
> >> +1 from me.
> >>
> >> Mike
> >>
> >> On Sat, Nov 11, 2017 at 9:43 PM, Ralph Goers <
> ralph.go...@dslextreme.com>
> >> wrote:
> >>
> >>> Currently all the messages from Jenkins, Jira and GitHub land in this
> >>> mailing list. That makes this mailing list very cluttered and it is easy to
> >>> miss discussions. Other projects use a “notifications” list to accept
> >>> emails from those sources so the dev list can be left for person-to-person
> >>> discussions.  I would like to propose that Flume switch to this model.
> >>> Another alternative would be to have separate lists for each of those
> >>> sources. My personal viewpoint is simply separating the automated emails is
> >>> enough but I’d be willing to go along with any plan that moves the
> >>> automated emails to another list.  I also think that doing this might
> >>> increase the number of subscribers on the Flume dev list as lots of people
> >>> don’t like to deal with all the extra email.
> >>>
> >>> All that said, it would be expected that all committers would subscribe to
> >>> these new lists.
> >>>
> >>> Thoughts?
> >>>
> >>> Ralph
> >>>
> >
> >
>
>


Re: [VOTE] Release Apache Flume version 1.8.0 RC2

2017-09-22 Thread Hari Shreedharan
Sorry for not voting yet. I can take a look early next week. If there are
enough binding votes before that, please push the release!

On Thu, Sep 21, 2017 at 3:56 AM, Mike Percy  wrote:

> +1 (binding)
>
> I only checked the source artifact, I didn't check the binary convenience
> artifacts.
>
> - Checksums and sigs match
> - LICENSE file looks good
> - README file looks good
> - Files match git tag
> - Ran a full build and all tests passed
>
> Thanks for managing this release Denes, and to the others that helped!
>
> Best,
> Mike
>
> On Fri, Sep 15, 2017 at 10:52 AM, Denes Arvay  wrote:
>
> > Hi Flume Community,
> >
> > This is the eleventh release for Apache Flume as a top-level project,
> > version 1.8.0. We are voting on release candidate RC2.
> >
> > It fixes the following issues:
> >   https://raw.githubusercontent.com/apache/flume/release-1.8.0-rc2/CHANGELOG
> >
> > *** Please cast your vote within the next 72 hours ***
> >
> > The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
> > for the source and binary artifacts can be found here:
> >   http://people.apache.org/~denes/apache-flume-1.8.0-rc2/
> >
> > Maven staging repo:
> >   https://repository.apache.org/content/repositories/orgapacheflume-1026/
> >
> > The tag to be voted on:
> >   https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=43a3c40
> >
> > Flume's KEYS file containing PGP keys we use to sign the release:
> >   https://www.apache.org/dist/flume/KEYS
> >
> > Thank you,
> > Denes
> >
>
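For anyone repeating the "checksums match" step from the vote above, here is a minimal sketch using the JDK's `MessageDigest`. The class and method names are made up for illustration, the artifact path and expected digest are supplied by the caller, and PGP signature (`*.asc`) verification is not covered.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing a downloaded artifact against a published checksum.
class ChecksumSketch {

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    static MessageDigest newDigest(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm); // e.g. "MD5" or "SHA-1"
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        }
    }

    // Digest of an in-memory byte array.
    static String digest(String algorithm, byte[] data) {
        return hex(newDigest(algorithm).digest(data));
    }

    // Digest of a file, streamed so large tarballs do not need to fit in memory.
    static String digestFile(String algorithm, Path file) throws IOException {
        MessageDigest md = newDigest(algorithm);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return hex(md.digest());
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            // args[0]: path to the artifact, args[1]: expected SHA-1 hex string
            String actual = digestFile("SHA-1", Paths.get(args[0]));
            System.out.println(actual.equalsIgnoreCase(args[1]) ? "match" : "MISMATCH");
        } else {
            System.out.println(digest("SHA-1", "abc".getBytes(StandardCharsets.UTF_8)));
        }
    }
}
```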


Re: Update Jenkins configuration to use Java 8

2017-07-04 Thread Hari Shreedharan
I switched the Java version for the trunk build to Java 8. Let me know if that
does not work.

On Tue, Jul 4, 2017 at 1:04 AM, Hari Shreedharan <hshreedha...@apache.org>
wrote:

> It looks like that script has been superseded by Whimsy, but I don't see
> support for this. Let me check with infra folks.
>
> Thanks
> Hari
>
> On Mon, Jul 3, 2017 at 6:49 AM, Denes Arvay <de...@cloudera.com> wrote:
>
>> Hi Community,
>>
>> A recent commit where the source and target versions were changed to 1.8
>> broke the Jenkins build.
>> Could a committer/PMC member with access to Jenkins update its
>> configuration? Alternatively, I'd happily do it myself if I had the proper
>> permission for it. According to this documentation, PMC Chairs can grant
>> access to any committer:
>> https://cwiki.apache.org/confluence/display/INFRA/Jenkins
>> Hari, may I ask you to grant the required permission to me?
>>
>> Thanks in advance,
>> Denes
>>
>
>


Re: Update Jenkins configuration to use Java 8

2017-07-04 Thread Hari Shreedharan
It looks like that script has been superseded by Whimsy, but I don't see
support for this. Let me check with infra folks.

Thanks
Hari

On Mon, Jul 3, 2017 at 6:49 AM, Denes Arvay  wrote:

> Hi Community,
>
> A recent commit where the source and target versions were changed to 1.8
> broke the Jenkins build.
> Could a committer/PMC member with access to Jenkins update its
> configuration? Alternatively, I'd happily do it myself if I had the proper
> permission for it. According to this documentation, PMC Chairs can grant
> access to any committer:
> https://cwiki.apache.org/confluence/display/INFRA/Jenkins
> Hari, may I ask you to grant the required permission to me?
>
> Thanks in advance,
> Denes
>


Re: Travis-CI build hung

2017-02-25 Thread Hari Shreedharan
Looks like it passed. Must have been Travis being under provisioned or
something

On Feb 24, 2017 3:32 PM, "Mike Percy"  wrote:

> The Travis pre-commit build request for PR #109 seems hung:
> https://travis-ci.org/apache/flume/builds/204857948
>
> Does anybody know if I need to have some special permissions to cancel that
> request and resubmit?
>
> Thanks,
> Mike
>


[jira] [Commented] (FLUME-3049) Wrapping the exception into SecurityException in UGIExecutor.execute hides the original one

2017-01-26 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840091#comment-15840091
 ] 

Hari Shreedharan commented on FLUME-3049:
-

I think that is a bug and was an oversight when I reviewed it. I think we 
should throw the actual exception itself. Please go ahead and remove the 
{{SecurityException}} wrapping so that the actual exception is thrown.

> Wrapping the exception into SecurityException in UGIExecutor.execute hides 
> the original one
> ---
>
> Key: FLUME-3049
> URL: https://issues.apache.org/jira/browse/FLUME-3049
> Project: Flume
>  Issue Type: Bug
>Reporter: Denes Arvay
>
> see: 
> https://github.com/apache/flume/blob/trunk/flume-ng-auth/src/main/java/org/apache/flume/auth/UGIExecutor.java#L49
> This has unexpected side effects as the callers try to catch the wrapped 
> exception, for example in {{BucketWriter.append()}}: 
> https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java#L563
> I don't know the original intent behind this wrapping. [~jrufus] or 
> [~hshreedharan], do you happen to remember? (You were involved in the 
> original implementation in FLUME-2631)
> Right now I don't see any problem in removing this and letting the original 
> exception propagate, as the {{org.apache.flume.auth.SecurityException}} 
> doesn't appear anywhere in the public interface.
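To make the side effect concrete, here is a minimal sketch of how wrapping hides an exception from a caller's `catch` block. All names are hypothetical: `WrappedSecurityException` only stands in for `org.apache.flume.auth.SecurityException`, the two `execute*` methods are not the real `UGIExecutor` code, and the caller catching `IOException` is an assumption made for illustration.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

class WrappingSketch {

    // Hypothetical stand-in for org.apache.flume.auth.SecurityException.
    static class WrappedSecurityException extends RuntimeException {
        WrappedSecurityException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Wraps every failure, which is the behaviour the issue describes.
    static <T> T executeWrapping(Callable<T> action) {
        try {
            return action.call();
        } catch (Exception e) {
            throw new WrappedSecurityException("Privileged action failed", e);
        }
    }

    // Lets the original exception propagate unchanged.
    static <T> T executePropagating(Callable<T> action) throws Exception {
        return action.call();
    }

    // A caller that is prepared to handle IOException specifically.
    static String callerOutcome(boolean wrap) {
        Callable<String> failing = () -> {
            throw new IOException("write failed");
        };
        try {
            if (wrap) {
                return executeWrapping(failing);
            }
            return executePropagating(failing);
        } catch (IOException e) {
            return "handled IOException";
        } catch (Exception e) {
            return "unexpected " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(callerOutcome(true));  // prints "unexpected WrappedSecurityException"
        System.out.println(callerOutcome(false)); // prints "handled IOException"
    }
}
```

With wrapping, the `IOException` handler is skipped and the original exception is only reachable through `getCause()`.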



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-3020) Improve HDFSEventSink Escape Ingestion by more then 10x by not getting InetAddress on every record

2016-11-01 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626313#comment-15626313
 ] 

Hari Shreedharan commented on FLUME-3020:
-

This looks good. We could actually cache even the replacement strings in local 
static variables to save that cost as well (which of course is trivial compared 
to the lookup cost). 

> Improve HDFSEventSink Escape Ingestion by more then 10x by not getting 
> InetAddress on every record
> --
>
> Key: FLUME-3020
> URL: https://issues.apache.org/jira/browse/FLUME-3020
> Project: Flume
>  Issue Type: Improvement
>Reporter: Theodore michael Malaska
>Assignee: Theodore michael Malaska
> Attachments: flume-3020.patch
>
>
> If you are using escaping, the current code will call InetAddress on every 
> record, which results in a huge impact on performance.
> TotalTime,8403,
> totalEventTakeTime,1498,
> totalWriteTime,1981,
> totalWriterSetupTime,65,
> commitTime,201,
> flushTime,18,
> startTrans,7,
> The rest is all InetAddress
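Going by the figures quoted above, the listed components add up to 3770 of the 8403 total, which leaves 4633 (roughly 55%) for the address lookup. The caching idea can be sketched as follows. This is not the attached patch: `CachedValue`, `lookupHostName` and the output path format are hypothetical names chosen for illustration, and the sketch assumes the host name does not change while the process runs.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.function.Supplier;

class HostCacheSketch {

    // Computes a value once on first use and reuses it afterwards (thread-safe).
    // Note: a loader that returns null is simply called again on the next get().
    static final class CachedValue<T> implements Supplier<T> {
        private final Supplier<T> loader;
        private volatile T value;

        CachedValue(Supplier<T> loader) {
            this.loader = loader;
        }

        @Override
        public T get() {
            T v = value;
            if (v == null) {
                synchronized (this) {
                    v = value;
                    if (v == null) {
                        v = loader.get();
                        value = v;
                    }
                }
            }
            return v;
        }
    }

    // The expensive call that should not run once per record.
    static String lookupHostName() {
        try {
            return InetAddress.getLocalHost().getCanonicalHostName();
        } catch (UnknownHostException e) {
            return "unknown-host"; // fallback chosen for this sketch only
        }
    }

    // Shared cache: the lookup happens at most once per JVM.
    static final Supplier<String> HOST = new CachedValue<>(HostCacheSketch::lookupHostName);

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            // Every "record" reuses the cached host name.
            System.out.println("/flume/events/" + HOST.get() + "/record-" + i);
        }
    }
}
```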





Re: [ANNOUNCE] Apache Flume 1.7.0 released

2016-10-19 Thread Hari Shreedharan
Thanks Donat and Mike! Great work!

On Wed, Oct 19, 2016 at 3:06 AM, Mike Percy  wrote:
> Thank you for all your hard work RMing this release Donat and getting it to
> the finish line. It was my pleasure to help out!
>
> Best,
> Mike
>
> On Tue, Oct 18, 2016 at 1:32 PM, Balazs Donat Bessenyei wrote:
>
>> And thank you, Mike Percy for the mentoring and tremendous amounts of
>> assistance with the release!
>>
>> On Tue, Oct 18, 2016 at 12:46 PM, Balazs Donat Bessenyei
>>  wrote:
>> > Thank you all very much who participated and helped with the release!
>> >
>> >
>> > Donat
>> >
>> >
>> > On Tue, Oct 18, 2016 at 12:37 PM, Mike Percy  wrote:
>> >> Woot! Congrats everyone!
>> >>
>> >> Thanks Donat for working so hard to get this version of Flume out the door!
>> >>
>> >> Best,
>> >> Mike
>> >>
>> >>
>> >> On Tue, Oct 18, 2016 at 10:09 AM, Bessenyei Balázs Donát <
>> bes...@apache.org>
>> >> wrote:
>> >>>
>> >>> The Apache Flume team is pleased to announce the release of Flume
>> >>> version 1.7.0.
>> >>>
>> >>> Flume is a distributed, reliable, and available service for efficiently
>> >>> collecting, aggregating, and moving large amounts of log data.
>> >>>
>> >>> This release can be downloaded from the Flume download page at:
>> >>> http://flume.apache.org/download.html
>> >>>
>> >>> The change log and documentation are available on the 1.7.0 release page:
>> >>> http://flume.apache.org/releases/1.7.0.html
>> >>>
>> >>> Your help and feedback is more than welcome. For more information on how
>> >>> to report problems and to get involved, visit the project website at
>> >>> http://flume.apache.org/
>> >>>
>> >>> The Apache Flume Team
>> >>
>> >>
>>


Re: Enabling Travis-CI on Flume

2016-10-14 Thread Hari Shreedharan
I think the free version of Travis has a 50 minute limit on the total build
time, and last I checked the full Flume build took longer than that. Is there
a way to work around that? I think I have a branch somewhere on GitHub that I
experimented with.

One interesting possibility with Travis is to integrate code coverage
reports into PR builds, which is pretty nice.

On Fri, Oct 14, 2016 at 3:09 AM Attila Simon  wrote:

> +1 on Jenkins, and on keeping our build infra as simple and intuitive as
> possible.
> If we migrate everything we currently do with Jenkins to Travis and
> abandon Jenkins, then I would be fine with that as well.
>
>
> *Attila Simon*
> Software Engineer
> Email:   s...@cloudera.com
>
> [image: Cloudera Inc.]
>
> On Fri, Oct 14, 2016 at 10:09 AM, Balazs Donat Bessenyei <
> bes...@cloudera.com> wrote:
>
> > My primary reason for Travis (vs. Jenkins) was that I have experience
> > with it.
> >
> > And it leaves these happy little checkmarks:
> > https://github.com/sebastianbergmann/phpunit/pull/1051/commits on the
> > commits and messages as seen on
> > https://github.com/apache/hive/pull/107 .
> >
> > Jenkins is probably configurable to achieve similar function. However,
> > I have no idea how to do such. (And could not find an example when I
> > did a quick search.)
> >
> > Are there any disadvantages of enabling Travis on Flume?
> >
> >
> > Thank you,
> >
> > Donat
> >
> > On Thu, Oct 13, 2016 at 6:06 PM, Lior Zeno  wrote:
> > > Jenkins can do PRs as well. If we can upgrade Jenkins to 2.0, we will be
> > > able to define the build step via Jenkinsfile which becomes very similar
> > > to Travis.
> > > Is there any reason to prefer Travis over Jenkins in our case?
> > >
> > > On Thu, Oct 13, 2016 at 7:01 PM, Balazs Donat Bessenyei <bes...@cloudera.com> wrote:
> > >
> > >> Hi All,
> > >>
> > >> Having something that checks proposed patches (PR-s especially)
> > >> automatically would help a lot with the development on Flume.
> > >>
> > >> I think, Travis-CI could be an easy solution and (afaik) we'd only have to
> > >> ask infra to enable it for us.
> > >>
> > >> Please, let me know your thoughts.
> > >>
> > >> Thank you,
> > >>
> > >> Donat
> > >>
> >
>


Re: [VOTE] Release Apache Flume version 1.7.0 RC2

2016-10-13 Thread Hari Shreedharan
+1 (binding)

Signatures and checksums look good
Top level files are all good.
The build runs fine, and a simple agent with a Seq source, memory channel and
HDFS sink runs fine as well.

On Thu, Oct 13, 2016 at 8:42 AM, Mike Percy  wrote:
> +1 (binding)
>
> There are some flaky tests which are listed below but I don't think they
> are release blockers.
>
> I performed the following checks:
>
> Binary convenience artifact:
> * Signature and checksums match
> * LICENSE, NOTICE, and README.md files in the binary convenience artifact
> look accurate and complete relative to the jars in lib/
> * Ran a very quick test with the binary artifact and it worked:
>   ./bin/flume-ng agent -c conf -f conf/flume-conf.properties.template -n agent -Dflume.root.logger=DEBUG,console
> * Checked that the documentation in docs/ renders: Flume User Guide and
> Flume Dev Guide are OK. Also spot-checked that the new Kafka security
> documentation was included in the User Guide
>
> Source artifact:
> * Signature and checksums match
> * Built Flume from the source artifact using Oracle 1.7.0_80 on Ubuntu
> Linux 16.04, sanity tested the resulting binary using the above method and
> it worked
> * RAT checks passed
> * Built a new source artifact out of the official source artifact and
> compiled it
> * I ran the unit tests. Most passed but the below two failed. These are
> flaky tests (we have a bunch of them in Flume) so I think it's fine not to
> block the release on them.
>   * TestExecSource.testMonitoredCounterGroup - looks like a racy test
>   * TestSpillableMemoryChannel - didn't investigate
>
> RC2 looks good to me.
>
> Thanks for running this release, Donat!
>
> Mike
>
> On Wed, Oct 12, 2016 at 9:29 PM, Balazs Donat Bessenyei wrote:
>
>> Hi All,
>>
>> This is the tenth release for Apache Flume as a top-level project,
>> version 1.7.0. We are voting on release candidate RC2.
>>
>> It fixes the following issues:
>>   https://raw.githubusercontent.com/apache/flume/flume-1.7/CHANGELOG
>>
>> *** Please cast your vote within the next 72 hours ***
>>
>> The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
>> for the source and binary artifacts can be found here:
>>   http://people.apache.org/~bessbd/apache-flume-1.7.0-rc2/
>>
>> Maven staging repo:
>>   https://repository.apache.org/content/repositories/orgapacheflume-1020/
>>
>> The tag to be voted on:
>>   https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=511d868
>>
>> Flume's KEYS file containing PGP keys we use to sign the release:
>>   https://www.apache.org/dist/flume/KEYS
>>
>>
>> Thank you,
>>
>> Donat
>>


Re: [ANNOUNCE] Two new Flume committers

2016-09-20 Thread Hari Shreedharan
Congrats Donat and Jeff!

On Tue, Sep 20, 2016 at 12:08 AM Lior Zeno  wrote:

> Congratulations Bessenyei and Jeff. Keep up your great work!
>
> On Sep 20, 2016 01:43, "Mike Percy"  wrote:
>
> > Hi Apache Flume community,
> >
> > I am very happy to announce that the Flume PMC has voted to add Bessenyei
> > Balázs Donát and Jeff Holoman as committers in recognition of their
> > contributions to Flume.
> >
> > Over the past few months, Donat has contributed and reviewed many patches,
> > more than any non-committer. He has contributed several bug fixes and
> > improvements and has shepherded important, long-forgotten patches through
> > the review and commit process, with more in progress. He is also currently
> > working on improvements to the Flume configuration system.
> >
> > Jeff has contributed several important improvements to Flume in recent
> > months, including adding support for secure Kafka to Flume, improving the
> > AvroEventSerializer, and adding additional smarts to the HDFS sink.
> >
> > Please join me in congratulating them on their new committership!
> >
> > Best regards,
> > Mike
> >
>


[jira] [Commented] (FLUME-2952) SyslogAgent possible NPE on stop()

2016-07-13 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376073#comment-15376073
 ] 

Hari Shreedharan commented on FLUME-2952:
-

+1. Please feel free to commit.

> SyslogAgent possible NPE on stop()
> --
>
> Key: FLUME-2952
> URL: https://issues.apache.org/jira/browse/FLUME-2952
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Mike Percy
>Assignee: Mike Percy
> Fix For: v1.7.0
>
> Attachments: FLUME-2952.patch
>
>
> If the SyslogAgent fails to start, calling close() will result in an NPE.
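The general shape of a fix for this kind of NPE is a null guard in the shutdown path. The sketch below is not the attached patch: `LifecycleSketch` and its `connection` field are hypothetical and only show the pattern of a resource that stays `null` when startup fails.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical component; NOT Flume's SyslogAgent.
class LifecycleSketch {
    // Remains null if start() fails before the resource is created.
    private Closeable connection;

    void start(boolean failBeforeOpen) {
        if (failBeforeOpen) {
            throw new IllegalStateException("could not start");
        }
        connection = () -> { /* nothing to release in this sketch */ };
    }

    // Safe to call whether or not start() succeeded.
    // Returns true only if an open resource was actually closed.
    boolean stop() {
        Closeable c = connection;
        connection = null;
        if (c == null) {
            return false; // never started, or already stopped: nothing to do
        }
        try {
            c.close();
        } catch (IOException e) {
            // real code would log this; the sketch ignores it
        }
        return true;
    }

    public static void main(String[] args) {
        LifecycleSketch agent = new LifecycleSketch();
        try {
            agent.start(true);
        } catch (IllegalStateException e) {
            System.out.println("start failed");
        }
        System.out.println(agent.stop()); // prints "false" instead of throwing an NPE
    }
}
```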





[jira] [Commented] (FLUME-2922) HDFSSequenceFile Should Sync Writer

2016-07-13 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375732#comment-15375732
 ] 

Hari Shreedharan commented on FLUME-2922:
-

No issues at all [~mpercy]. I have not had the time recently to do reviews. 
[This|https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py] is the 
script that Spark uses to merge and 
[this|https://github.com/apache/spark/blob/master/dev/github_jira_sync.py] is 
the one to link the PR to jira.

> HDFSSequenceFile Should Sync Writer
> ---
>
> Key: FLUME-2922
> URL: https://issues.apache.org/jira/browse/FLUME-2922
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Kevin Conaway
>Priority: Critical
> Attachments: FLUME-2922.patch
>
>
> There is a possibility of losing data with the current HDFS sequence file 
> writer.
> Internally, the `SequenceFile.Writer` buffers data and periodically syncs it 
> to the underlying output stream.  The mechanism for doing this is dependent 
> on whether you are using compression or not but in both scenarios, the 
> key/values are appended to an internal buffer and only flushed to disk after 
> the buffer reaches a certain size.
> Thus it is quite possible for Flume to lose messages if the agent crashes, or 
> is stopped, before the internal buffer is flushed to disk.
> The correct action is to force the writer to sync its internal buffers to the 
> underlying `FSDataOutputStream` first before calling hflush/sync.
> Additionally, I believe we should be calling hsync instead of hflush.  It's my 
> understanding that writes with hsync should be more durable, which I believe are 
> the semantics we want here.
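The ordering problem described above can be shown with plain `java.io`. This is only an analogy and does not use the Hadoop API: a `BufferedOutputStream` stands in for the buffering `SequenceFile.Writer`, and a `ByteArrayOutputStream` stands in for the underlying `FSDataOutputStream`. Flushing the underlying stream alone does nothing for data still held in the writer's buffer.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class FlushOrderSketch {

    // Returns {bytes visible after flushing only the underlying stream,
    //          bytes visible after also flushing the buffering writer}.
    static int[] visibleBytes(byte[] record) {
        ByteArrayOutputStream underlying = new ByteArrayOutputStream();
        BufferedOutputStream writer = new BufferedOutputStream(underlying, 8192);
        try {
            writer.write(record);      // lands in the writer's internal buffer

            underlying.flush();        // wrong order: the record has not reached this stream yet
            int beforeWriterFlush = underlying.size();

            writer.flush();            // push the buffer down first...
            underlying.flush();        // ...then flush the underlying stream
            int afterWriterFlush = underlying.size();

            return new int[] {beforeWriterFlush, afterWriterFlush};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = visibleBytes("event-00001".getBytes(StandardCharsets.UTF_8));
        System.out.println("after underlying flush only: " + result[0]); // prints 0
        System.out.println("after writer flush:          " + result[1]); // prints 11
    }
}
```

If the process died between the two flushes, the 11 buffered bytes would be lost, which is the failure mode the issue reports.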





[jira] [Commented] (FLUME-2922) HDFSSequenceFile Should Sync Writer

2016-07-13 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375668#comment-15375668
 ] 

Hari Shreedharan commented on FLUME-2922:
-

Actually, since you can't merge via github (ASF does not give us write access 
via github) - we basically need to pull the branch into our local master and 
then commit it, but I think it will still bring in multiple commits. Spark does 
have a script that squashes the commits and then pushes it (and does some jira 
linking too).



Re: branches: flume-1.7 vs trunk

2016-07-13 Thread Hari Shreedharan
+1. Let's get rid of the 2 branch policy. I think we can now technically
force-push to all branches, but we should probably still avoid it. Please
go ahead and sync 1.7 with trunk.

On Wed, Jul 13, 2016 at 1:11 PM Mike Percy  wrote:

> On Wed, Jul 13, 2016 at 1:08 PM, Mike Percy  wrote:
> >
> > PS: In the meantime, I will cherry-pick all of my recent commits to the
> > flume-1.7 branch. I just don't think we should create a flume-1.8 branch
> > once we release Flume 1.7.0
> >
>
> Actually, my preference would be to force-push the flume-1.7 branch so that
> it exactly matches trunk. IIRC, the only branch we cannot force-push is
> trunk, unless that ASF git configuration has changed recently.
>
> Mike
>


Re: Jenkins build is back to normal : Flume-trunk-hbase-1 #178

2016-07-12 Thread Hari Shreedharan
w00t!

On Tue, Jul 12, 2016 at 3:43 AM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> See 
>
>


[jira] [Commented] (FLUME-2725) HDFS Sink does not use configured timezone for rounding

2016-07-08 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368626#comment-15368626
 ] 

Hari Shreedharan commented on FLUME-2725:
-

I agree it is a bug. I don't think we actually need it to be backwards 
compatible - it is a bug fix, we don't need bug fixes to be backwards 
compatible.

> HDFS Sink does not use configured timezone for rounding
> ---
>
> Key: FLUME-2725
> URL: https://issues.apache.org/jira/browse/FLUME-2725
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Reporter: Eric Czech
>Assignee: Denes Arvay
>Priority: Minor
> Attachments: FLUME-2725.patch
>
>
> When a BucketPath used by an HDFS sink is configured to run with some 
> roundUnit and roundValue > 1 (e.g. 6 hours), the "roundDown" function used by 
> BucketPath does not actually round the date correctly.
> That function calls TimestampRoundDownUtil which creates a Calendar instance 
> using the *local* timezone to truncate a unix timestamp rather than the 
> TimeZone that the sink was configured to convert dates to paths with (and 
> that timezone is already available in the BucketPath class but it just isn't 
> passed to TimestampRoundDownUtil).
> The net effect of this is that if a flume jvm is running on a system with an 
> EST clock while trying to write, say, 6 hour directories in UTC time, the 
> directories are written with the hours 04, 10, 16, 22 rather than 00, 06, 12, 
> 18 like you would expect.
> I found a workaround for this by passing 
> "-Duser.timezone=" as a system property, but I wanted to 
> create a ticket for this since it seems like it would be very minimal effort 
> to carry that configured timezone down into the rounding utility as well.
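The effect described in the issue can be reproduced with a few lines of `java.util.Calendar`. This is a sketch and not Flume's `TimestampRoundDownUtil`: `roundDownHours` is a hypothetical helper that takes the time zone as a parameter, and a fixed `GMT-04:00` offset is assumed for the "local" clock so the result is deterministic.

```java
import java.util.Calendar;
import java.util.TimeZone;

class RoundDownSketch {

    // Hypothetical helper: truncates a timestamp to a multiple of `roundDown`
    // hours, interpreting the hour of day in the given time zone.
    static long roundDownHours(long timestampMs, int roundDown, TimeZone tz) {
        Calendar cal = Calendar.getInstance(tz);
        cal.setTimeInMillis(timestampMs);
        int hour = cal.get(Calendar.HOUR_OF_DAY);
        cal.set(Calendar.HOUR_OF_DAY, (hour / roundDown) * roundDown);
        cal.set(Calendar.MINUTE, 0);
        cal.set(Calendar.SECOND, 0);
        cal.set(Calendar.MILLISECOND, 0);
        return cal.getTimeInMillis();
    }

    static int hourIn(long timestampMs, TimeZone tz) {
        Calendar cal = Calendar.getInstance(tz);
        cal.setTimeInMillis(timestampMs);
        return cal.get(Calendar.HOUR_OF_DAY);
    }

    // 2016-07-08 13:30:00 UTC, built without hard-coding an epoch value.
    static long sampleTimestamp() {
        Calendar utc = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        utc.clear();
        utc.set(2016, Calendar.JULY, 8, 13, 30, 0);
        return utc.getTimeInMillis();
    }

    public static void main(String[] args) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        TimeZone local = TimeZone.getTimeZone("GMT-04:00"); // assumed local offset
        long ts = sampleTimestamp();

        // Rounding in the local zone, then writing the path in UTC: bucket hour 10.
        System.out.println(hourIn(roundDownHours(ts, 6, local), utc)); // prints 10

        // Rounding in the same zone the path is written in: bucket hour 12.
        System.out.println(hourIn(roundDownHours(ts, 6, utc), utc));   // prints 12
    }
}
```

The first result falls in the 04, 10, 16, 22 series from the report, and the second in the expected 00, 06, 12, 18 series.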





[jira] [Commented] (FLUME-2941) Integrate checkstyle for test classes

2016-07-08 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368610#comment-15368610
 ] 

Hari Shreedharan commented on FLUME-2941:
-

+1. Please go ahead and commit!

> Integrate checkstyle for test classes
> -
>
> Key: FLUME-2941
> URL: https://issues.apache.org/jira/browse/FLUME-2941
> Project: Flume
>  Issue Type: Improvement
>Reporter: Lior Zeno
>Assignee: Mike Percy
>Priority: Minor
> Fix For: v1.7.0
>
>
> We should add the maven-checkstyle-plugin to the build process. This plugin 
> can fail a build if the code does not honor the style of our project. This 
> way we can make sure that we have one common style in the code. In addition, 
> reviewers can focus on design, correctness, performance and other important 
> coding aspects other than style issues.





Re: Review Request 49830: FLUME-2941. Integrate checkstyle for test classes

2016-07-08 Thread Hari Shreedharan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49830/#review141357
---


Ship it!




- Hari Shreedharan


On July 8, 2016, 10:04 p.m., Mike Percy wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49830/
> ---
> 
> (Updated July 8, 2016, 10:04 p.m.)
> 
> 
> Review request for Flume and Hari Shreedharan.
> 
> 
> Bugs: FLUME-2941
> https://issues.apache.org/jira/browse/FLUME-2941
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> This patch makes the Flume test code conform to the Google style guidelines.
> 
> This patch also makes all future style violations fatal to the build.
> 
> This patch is whitespace-only from a code perspective. After stripping
> line numbers, the generated test bytecode before and after these changes
> is identical.
> 
> 
> Diffs
> -
> 
>   flume-checkstyle/pom.xml 31db3c0 
>   flume-checkstyle/src/main/resources/flume/checkstyle-suppressions.xml 
> 49c8834 
>   flume-checkstyle/src/main/resources/flume/checkstyle.xml e8913f0 
>   
> flume-ng-auth/src/test/java/org/apache/flume/auth/TestFlumeAuthenticator.java 
> 5a8860d 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/CountingSinkRunner.java
>  0733dc4 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/CountingSourceRunner.java
>  b6abc35 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestCheckpoint.java
>  c1de12e 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestEventQueueBackingStoreFactory.java
>  52c706d 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestEventUtils.java
>  c72e3f2 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannel.java
>  bb22e26 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannelFormatRegression.java
>  c95122b 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannelRestart.java
>  d5fe6fb 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannelRollback.java
>  23fc64b 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFlumeEventQueue.java
>  1adb21a 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestIntegration.java
>  2fbe116 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLog.java
>  b1f59cd 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLogFile.java
>  976a112 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecordV2.java
>  2356d90 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecordV3.java
>  eb0ce04 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestUtils.java
>  61f38d2 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/encryption/CipherProviderTestSuite.java
>  530ccf6 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/encryption/EncryptionTestUtils.java
>  6ca3246 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/encryption/TestAESCTRNoPaddingProvider.java
>  a7c7cb2 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/encryption/TestFileChannelEncryption.java
>  d4537a8 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/encryption/TestJCEFileKeyProvider.java
>  f33cada 
>   
> flume-ng-channels/flume-jdbc-channel/src/test/java/org/apache/flume/channel/jdbc/BaseJdbcChannelProviderTest.java
>  85ad7fe 
>   
> flume-ng-channels/flume-jdbc-channel/src/test/java/org/apache/flume/channel/jdbc/MockEvent.java
>  1e412c5 
>   
> flume-ng-channels/flume-jdbc-channel/src/test/java/org/apache/flume/channel/jdbc/MockEventUtils.java
>  10d8b51 
>   
> flume-ng-channels/flume-jdbc-channel/src/test/java/org/apache/flume/channel/jdbc/TestDerbySchemaHandlerQueries.java
>  362bcfa 
>   
> f

[jira] [Commented] (FLUME-2937) Integrate checkstyle

2016-06-29 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356269#comment-15356269
 ] 

Hari Shreedharan commented on FLUME-2937:
-

I just +1-ed the patch. 

[~mpercy] - Feel free to commit it.

> Integrate checkstyle
> 
>
> Key: FLUME-2937
> URL: https://issues.apache.org/jira/browse/FLUME-2937
> Project: Flume
> Issue Type: Improvement
> Reporter: Lior Zeno
> Assignee: Mike Percy
> Priority: Minor
> Fix For: v1.7.0
>
>
> We should add the maven-checkstyle-plugin to the build process. This plugin 
> can fail a build if the code does not honor the style of our project. This 
> way we can make sure that we have one common style in the code. In addition, 
> reviewers can focus on design, correctness, performance and other important 
> coding aspects other than style issues.





Re: Review Request 49403: FLUME-2937. Integrate checkstyle for non-test code

2016-06-29 Thread Hari Shreedharan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49403/#review140084
---


Ship it!




I didn't look at the entire patch, but I like what I saw (~10%). Since the 
bytecode is the same, we should commit this.

- Hari Shreedharan


On June 29, 2016, 11:45 p.m., Mike Percy wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49403/
> ---
> 
> (Updated June 29, 2016, 11:45 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2937
> https://issues.apache.org/jira/browse/FLUME-2937
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> This patch runs checkstyle as part of the Maven build and fixes existing 
> files to adhere to that style. This patch is only for the runtime code. It 
> does not include the unit test code.
> 
> The style being used is the Google Java style, with some minor loosening to 
> be close to the style that the code is mostly already written in.
> 
> 
> Diffs
> -
> 
>   flume-checkstyle/pom.xml PRE-CREATION 
>   flume-checkstyle/src/main/resources/flume/checkstyle-suppressions.xml 
> PRE-CREATION 
>   flume-checkstyle/src/main/resources/flume/checkstyle.xml PRE-CREATION 
>   
> flume-ng-auth/src/main/java/org/apache/flume/api/SecureRpcClientFactory.java 
> c976458 
>   flume-ng-auth/src/main/java/org/apache/flume/api/SecureThriftRpcClient.java 
> f31582c 
>   
> flume-ng-auth/src/main/java/org/apache/flume/auth/FlumeAuthenticationUtil.java
>  5627652 
>   
> flume-ng-auth/src/main/java/org/apache/flume/auth/KerberosAuthenticator.java 
> 4a0e0f4 
>   flume-ng-auth/src/main/java/org/apache/flume/auth/KerberosUser.java dd37721 
>   flume-ng-auth/src/main/java/org/apache/flume/auth/SimpleAuthenticator.java 
> f7b5bea 
>   flume-ng-auth/src/main/java/org/apache/flume/auth/UGIExecutor.java cd62b91 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/BadCheckpointException.java
>  588506a 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/CheckpointRebuilder.java
>  b961ae2 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Commit.java
>  3663244 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/CorruptEventException.java
>  691d291 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFactory.java
>  456df34 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFile.java
>  2b0987b 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFileV2.java
>  abd2ea3 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFileV3.java
>  9dfa0d1 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventUtils.java
>  ff5242a 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java
>  ed2b996 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannelConfiguration.java
>  5c3c48f 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEvent.java
>  53c1251 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventPointer.java
>  5f06ab7 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java
>  d305f4d 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java
>  247c287 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFile.java
>  488dcf4 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileFactory.java
>  7d7fd85 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileRetryableIOException.java
>  9447652 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileV2.java
>  bb25e95 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileV3.java
>  9b0ef93 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogRecord.java
&

Re: new contributor

2016-06-29 Thread Hari Shreedharan
Done.

On Wed, Jun 29, 2016 at 1:25 AM Denes Arvay  wrote:

> Hi All,
>
> I'd like to contribute to Flume, could you please add me to the Jira
> project?
>
> Thanks,
> Denes
>


Re: [DISCUSS] Checkstyle maven plugin

2016-06-27 Thread Hari Shreedharan
+10 :-)

I am in complete agreement with Mike. It is usually painful for the
reviewer and the developer to go back and fix style issues in each review.
I am not sure if a precommit hook will suffice, since we don't actually run
pre-commit builds. We will probably need to add it to the full build so the
developer can figure out the issues before even submitting the patch for
review.

On Fri, Jun 24, 2016 at 11:09 PM Lior Zeno  wrote:

> +1.
> We had a thread about it here:
>
> http://mail-archives.apache.org/mod_mbox/flume-dev/201606.mbox/raw/%3CCAA6RhS9%2BzsJ8GNom3FSjB7MN_Zb2aWfOSXxh_RC-MuvhAfQC7g%40mail.gmail.com%3E/1
> In addition, I created a jira issue for that. I hope we can add this to
> 1.8.0: https://issues.apache.org/jira/browse/FLUME-2937
>
> On Sat, Jun 25, 2016 at 1:11 AM, Ashish  wrote:
>
> > +1
> >
> > On Fri, Jun 24, 2016 at 2:24 PM, Mike Percy  wrote:
> > > Hey devs,
> > > > Code nitpicks have come up a bit lately (in code I'm the reviewer of).
> > > > Other Apache projects such as HBase and Kafka use checkstyle to do a
> > > > pre-commit check at build time. Rather than spend time going back and
> > > > forth on code style, how about we adopt the checkstyle plugin for Flume?
> > > >
> > > > I'd propose adopting the Google Java style. It's what the vast majority
> > > > of the Flume code uses today, and there is a config file shipped with
> > > > checkstyle for it. Here's a link to it:
> > > > https://google.github.io/styleguide/javaguide.html
> > > >
> > > > My goal is just to maintain a consistent style throughout the code base
> > > > and avoid the review noise. Please let me know whether or not this sounds
> > > > helpful.
> > >
> > > Thanks,
> > > Mike
> >
> >
> >
> > --
> > thanks
> > ashish
> >
> > Blog: http://www.ashishpaliwal.com/blog
> > My Photo Galleries: http://www.pbase.com/ashishpaliwal
> >
>


Re: Review Request 49025: FLUME-1899: Make SpoolDir work with Sub-Directories

2016-06-21 Thread Hari Shreedharan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49025/#review138910
---



Since Flume 1.7+ will now only support Java 7+, why don't we consider using 
Java 7's new DirectoryStream API, rather than the much more expensive listFiles 
API? This probably will result in a huge performance boost plus far simpler 
code structure. It will require some rewrite of the code, but since we have 
tests and we expect current behavior, I would suggest doing that.
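
A minimal sketch of what that could look like, assuming a hypothetical helper
class named SpoolDirLister (it is not part of Flume and not taken from the
patch). It iterates directory entries lazily with Files.newDirectoryStream
instead of materializing an array with File.listFiles, and optionally walks
into sub-directories:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flume: collects the regular files under a
// spool directory, optionally descending into sub-directories.
final class SpoolDirLister {
  private SpoolDirLister() {}

  static List<Path> listFiles(Path dir, boolean recursive) throws IOException {
    List<Path> result = new ArrayList<Path>();
    collect(dir, recursive, result);
    return result;
  }

  private static void collect(Path dir, boolean recursive, List<Path> out)
      throws IOException {
    // DirectoryStream yields entries lazily; try-with-resources closes the
    // underlying directory handle when iteration is done.
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path entry : stream) {
        if (Files.isDirectory(entry)) {
          if (recursive) {
            collect(entry, true, out);
          }
        } else if (Files.isRegularFile(entry)) {
          out.add(entry);
        }
      }
    }
  }
}
```

Whether this is actually faster for a given spool directory would need to be
measured; the sketch only shows the shape of the API.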

- Hari Shreedharan


On June 21, 2016, 3:32 p.m., Balázs Donát Bessenyei wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49025/
> ---
> 
> (Updated June 21, 2016, 3:32 p.m.)
> 
> 
> Review request for Flume, Denes Arvay, Mike Percy, and Attila Simon.
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> SpoolDir currently monitors a single directory and cannot handle 
> sub-directories. This JIRA is to make SpoolDir able to walk down a source 
> directory and monitor new files.
> 
> 
> Diffs
> -
> 
>   
> flume-ng-core/src/main/java/org/apache/flume/client/avro/ReliableSpoolingFileEventReader.java
>  d54f415 
>   
> flume-ng-core/src/main/java/org/apache/flume/source/SpoolDirectorySource.java 
> 3fe947d 
>   
> flume-ng-core/src/main/java/org/apache/flume/source/SpoolDirectorySourceConfigurationConstants.java
>  5053697 
>   
> flume-ng-core/src/test/java/org/apache/flume/source/TestSpoolDirectorySource.java
>  fe530ff 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 74d2887 
> 
> Diff: https://reviews.apache.org/r/49025/diff/
> 
> 
> Testing
> ---
> 
> Ran tests before the patch:
> # mvn clean install -DskipTests -Drat.skip=true; mvn -pl flume-ng-core 
> -Drat.skip=true test
> Tests run: 378, Failures: 0, Errors: 0, Skipped: 2
> 
> [INFO] 
> 
> [INFO] BUILD SUCCESS
> [INFO] 
> 
> [INFO] Total time: 07:55 min
> [INFO] Finished at: 2016-06-21T16:13:46+02:00
> [INFO] Final Memory: 35M/510M
> [INFO] 
> 
> 
> After patch:
> # mvn clean install -DskipTests -Drat.skip=true; mvn -pl flume-ng-core 
> -Drat.skip=true test
> Tests run: 380, Failures: 0, Errors: 0, Skipped: 2
> 
> [INFO] 
> 
> [INFO] BUILD SUCCESS
> [INFO] 
> 
> [INFO] Total time: 06:18 min
> [INFO] Finished at: 2016-06-21T17:04:17+02:00
> [INFO] Final Memory: 35M/511M
> [INFO] 
> 
> 
> Patch also includes docs
> 
> 
> Thanks,
> 
> Balázs Donát Bessenyei
> 
>



[jira] [Commented] (FLUME-2919) Upgrade the Solr version to 6.0.1

2016-06-21 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341170#comment-15341170
 ] 

Hari Shreedharan commented on FLUME-2919:
-

Doesn't Solr 6 *require* Java 8?

> Upgrade the Solr version to 6.0.1
> -
>
> Key: FLUME-2919
> URL: https://issues.apache.org/jira/browse/FLUME-2919
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: v1.6.0
> Reporter: Minoru Osuka
> Fix For: v1.7.0
>
> Attachments: FLUME-2919-1.patch, FLUME-2919-2.patch, FLUME-2919.patch
>
>
> Flume morphline-solr-sink is using Solr 4.3.0. Recently, Solr 6.0.1 has been 
> released. I propose to upgrade to it.





Re: make junit tests passing again

2016-06-14 Thread Hari Shreedharan
I completely agree. My best guess is that most of these are flaky tests,
since they rely on Thread.sleep etc. Just FYI, fixing all the tests will
probably take a lot of time.
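
One common way to remove that kind of timing dependence, sketched here with a
hypothetical helper (LatchInsteadOfSleep is not from the Flume test suite), is
to wait on an explicit completion signal with a timeout instead of sleeping
for a fixed interval:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical example, not taken from the Flume test suite.
final class LatchInsteadOfSleep {
  private LatchInsteadOfSleep() {}

  // Runs the task on a background thread and waits until it signals
  // completion or the timeout elapses. Returns true if it finished in time.
  static boolean runAndAwait(final Runnable task, long timeout, TimeUnit unit)
      throws InterruptedException {
    final CountDownLatch done = new CountDownLatch(1);
    Thread worker = new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          task.run();
        } finally {
          // Signal completion even if the task throws.
          done.countDown();
        }
      }
    });
    worker.setDaemon(true);
    worker.start();
    // Returns as soon as the task is done, rather than after a fixed sleep.
    return done.await(timeout, unit);
  }
}
```

A test would then assert on the boolean result, so a slow machine makes the
test slower rather than making it fail.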

On Tue, Jun 14, 2016 at 9:40 AM Attila Simon  wrote:

> Hi All,
>
> When I run "mvn test" on my vanilla Flume repo, it fails. Also, when I
> checked the associated Jenkins job
> (https://builds.apache.org/view/All/job/Flume-trunk-hbase-1/), it showed
> that the tests haven't been green for a while. Am I doing something
> completely wrong?
> I suspect hadoop/hbase libs are required, but I couldn't find docs that
> tell a newcomer how to set up the environment properly. I thought Maven
> would resolve all of the dependencies; if not, what is the recommended
> way to get them? Could you please point me in the right direction? Any
> help is greatly appreciated.
>
> If this is a known issue, I would like to devote some of my time to
> cleaning it up: releasing 1.7.0 would benefit from stable tests, and in
> general it would improve the health of the whole project.
>
> I ran tests with this command after "git clone":
> mvn test --fail-at-end
>
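> [INFO] Apache Flume ....................................... SUCCESS [  0.365 s]
> [INFO] Flume NG SDK ....................................... FAILURE [01:00 min]
> [INFO] Flume NG Configuration ............................. SKIPPED
> [INFO] Flume Auth ......................................... SKIPPED
> [INFO] Flume NG Core ...................................... SKIPPED
> [INFO] Flume NG Sinks ..................................... SUCCESS [  0.010 s]
> [INFO] Flume NG HDFS Sink ................................. SKIPPED
> [INFO] Flume NG IRC Sink .................................. SKIPPED
> [INFO] Flume NG Channels .................................. SUCCESS [  0.009 s]
> [INFO] Flume NG JDBC channel .............................. SKIPPED
> [INFO] Flume NG file-based channel ........................ SKIPPED
> [INFO] Flume NG Spillable Memory channel .................. SKIPPED
> [INFO] Flume NG Node ...................................... SKIPPED
> [INFO] Flume NG Embedded Agent ............................ SKIPPED
> [INFO] Flume NG HBase Sink ................................ SKIPPED
> [INFO] Flume NG ElasticSearch Sink ........................ SKIPPED
> [INFO] Flume NG Morphline Solr Sink ....................... SKIPPED
> [INFO] Flume Kafka Sink ................................... SKIPPED
> [INFO] Flume NG Kite Dataset Sink ......................... SKIPPED
> [INFO] Flume NG Hive Sink ................................. SKIPPED
> [INFO] Flume Sources ...................................... SUCCESS [  0.010 s]
> [INFO] Flume Scribe Source ................................ SKIPPED
> [INFO] Flume JMS Source ................................... SKIPPED
> [INFO] Flume Twitter Source ............................... SKIPPED
> [INFO] Flume Kafka Source ................................. SKIPPED
> [INFO] Flume Taildir Source ............................... SKIPPED
> [INFO] flume-kafka-channel ................................ SKIPPED
> [INFO] Flume legacy Sources ............................... SUCCESS [  0.009 s]
> [INFO] Flume legacy Avro source ........................... SKIPPED
> [INFO] Flume legacy Thrift Source ......................... SKIPPED
> [INFO] Flume NG Clients ................................... SUCCESS [  0.008 s]
> [INFO] Flume NG Log4j Appender ............................ SKIPPED
> [INFO] Flume NG Tools ..................................... SKIPPED
> [INFO] Flume NG distribution .............................. SKIPPED
> [INFO] Flume NG Integration Tests ......................... SKIPPED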
>
> Environment:
> OS X El Capitan
> Java(TM) SE Runtime Environment (build 1.7.0_79-b15), apache flume trunk
> set | grep -i -Ee flume\|hadoop\|hbase  => nothing
>
> Cheers,
> Attila
>


Re: Flume 1.7.0 release

2016-06-13 Thread Hari Shreedharan
Sure. I can help with the release. My time to work on this would be
limited, so it might take me a day or two between updates.
On Mon, Jun 13, 2016 at 9:29 PM Saikat Kanjilal <sxk1...@hotmail.com> wrote:

> Works for me,  let me know how I can get involved/help.
>
> Sent from my iPhone
>
> > On Jun 13, 2016, at 9:24 PM, Lior Zeno <liorz...@gmail.com> wrote:
> >
> > Saikat, I still think that we should discuss it here. We can talk in
> > private about specific issues if you'd like.
> >
> > I'll open an umbrella issue for this release. It will include all
> necessary
> > steps, e.g. keys and docs, but also jira cleaning and reviewing all
> pending
> > patches we have. I know it's a lot of work, but it has to be done.
> >
> > Hari, thank you. Would you please mentor?
> >> On Jun 14, 2016 3:21 AM, "Hari Shreedharan" <hshreedha...@apache.org>
> wrote:
> >>
> >> Only committers can commit to the repo. In the past, when release
> managers
> >> were not committers, one committer mentored the release manager. For the
> >> release, patches would be posted as usual on a release and the committer
> >> would commit the patches. Basically follow all steps as is, and then
> just
> >> post patches to jiras. See the umbrella jira for Flume 1.6 release:
> >> https://issues.apache.org/jira/browse/FLUME-2674
> >>
> >> Lior - I have added you as a contributor on jira. This should allow you
> to
> >> create jira tickets and assign them to yourself.
> >>
> >> On Mon, Jun 13, 2016 at 4:31 PM Saikat Kanjilal <sxk1...@hotmail.com>
> >> wrote:
> >>
> >>> Should we have a google hangout session to figure this out?
> >>>
> >>>> From: liorz...@gmail.com
> >>>> Date: Mon, 13 Jun 2016 22:44:29 +0300
> >>>> Subject: Re: Flume 1.7.0 release
> >>>> To: dev@flume.apache.org
> >>>>
> >>>> Guys, thank you for your support and motivation. There are still two
> >> big
> >>>> issues we need to figure out before we can proceed:
> >>>> (a) Who can commit to the repo?
> >>>> (b) Who has JIRA permissions?
> >>>>
> >>>> Hari, I'll be happy to run this release, if that's fine by everyone.
> >>>>
> >>>>> On Mon, Jun 13, 2016 at 6:52 AM, Lior Zeno <liorz...@gmail.com>
> wrote:
> >>>>>
> >>>>> Hi Shiwei,
> >>>>>
> >>>>> Please see this:
> >>>>> https://cwiki.apache.org/confluence/display/FLUME/How+to+Contribute.
> >>>>> In a nutshell, issues are managed on JIRA, and code contributions are
> >>>>> done via patches (not pull requests).
> >>>>> Regarding a release plan, that is true: we will need to discuss a
> >>>>> roadmap and manage our release plans more carefully. However, since
> >>>>> there has not been a new release in the past year, the next version of
> >>>>> Flume will have plenty of new features and bug fixes :)
> >>>>>
> >>>>> On Mon, Jun 13, 2016 at 6:26 AM, shiwei qin <qinshiw...@gmail.com>
> >>> wrote:
> >>>>>
> >>>>>> I'd like to see the new version released as soon as possible. Also, I
> >>>>>> have never seen a release plan. In fact, what I want most is for
> >>>>>> development to migrate to GitHub. There are a lot of people, like me,
> >>>>>> who do not really know how to contribute code to Flume.
> >>>>>>
> >>>>>> 2016-06-13 6:43 GMT+08:00 Attila Simon <s...@cloudera.com>:
> >>>>>>
> >>>>>>> Hi All,
> >>>>>>>
> >>>>>>> I would love to hear more and get involved.
> >>>>>>>
> >>>>>>> Just another enthusiastic developer candidate,
> >>>>>>> Attila
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sun, Jun 12, 2016 at 7:09 AM, Saikat Kanjilal <
> >>> sxk1...@hotmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Sure will do, meanwhile should we get together over google
> >>> hangout to
> >>>>>>>

Re: Flume 1.7.0 release

2016-06-13 Thread Hari Shreedharan
> > >> > > > the plugin.
> > >> > > >
> > >> > > > Sent from my iPhone
> > >> > > >
> > >> > > > > On Jun 12, 2016, at 2:03 AM, Lior Zeno <liorz...@gmail.com>
> > >> wrote:
> > >> > > > >
> > >> > > > > This is absolutely true. The availability of the project's
> > >> committers
> > >> > > is
> > >> > > > > very low, which leads to cases like yours where people submit
> > >> patches
> > >> > > and
> > >> > > > > never get a response, even after a year of waiting. I believe
> > >> that we
> > >> > > have
> > >> > > > > to be more active and available, since low availability
> > >> discourages
> > >> > > > > contributors.
> > >> > > > > I think that such discussions should take place either here
> or on
> > >> > JIRA,
> > >> > > > > since it does not limit the discussion to a small group of
> people,
> > >> > but
> > >> > > > > instead allows the community to be a part of the project's
> future.
> > >> > > > >
> > >> > > > > On Sun, Jun 12, 2016 at 2:09 AM, Saikat Kanjilal <
> > >> > sxk1...@hotmail.com>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > >> I would like to help out managing Jira's but never heard back
> > >> from
> > >> > the
> > >> > > > >> community about the graph sink that I've been working on.
> Does it
> > >> > make
> > >> > > > >> sense to do a google hangout to discuss roadmap/upcoming
> > >> features?
> > >> > > > >>
> > >> > > > >> Sent from my iPhone
> > >> > > > >>
> > >> > > > >>> On Jun 11, 2016, at 10:19 AM, Lior Zeno <liorz...@gmail.com
> >
> > >> > wrote:
> > >> > > > >>>
> > >> > > > >>> Let's first examine our JIRA issues. We have fixed issues
> with
> > >> an
> > >> > > empty
> > >> > > > >>> fixVersion. In addition, we still have unresolved issues
> with
> > >> > > > >>> fixVersion=v1.7.0. Let's deal with these first. I would do
> it
> > >> > myself,
> > >> > > > >> but I
> > >> > > > >>> don't have the appropriate permissions for that.
> > >> > > > >>>
> > >> > > > >>> On Sat, Jun 11, 2016 at 8:10 PM, Hari Shreedharan <
> > >> > > > >> hshreedha...@apache.org>
> > >> > > > >>> wrote:
> > >> > > > >>>
> > >> > > > >>>> Sounds good to me. If we have a volunteer to run the release
> > >> > > > >>>> I will gladly help out.
> > >> > > > >>>>> On Sat, Jun 11, 2016 at 6:54 AM Lior Zeno <
> liorz...@gmail.com
> > >> >
> > >> > > wrote:
> > >> > > > >>>>>
> > >> > > > >>>>> Anybody?
> > >> > > > >>>>>
> > >> > > > >>>>>> On Thu, Jun 9, 2016 at 7:24 PM, Lior Zeno <
> > >> liorz...@gmail.com>
> > >> > > wrote:
> > >> > > > >>>>>>
> > >> > > > >>>>>> Hi guys,
> > >> > > > >>>>>>
> > >> > > > >>>>>> I think we should work together towards a new release.
> It has
> > >> > > been a
> > >> > > > >>>> year
> > >> > > > >>>>>> since the last release, and there are many new features
> that
> > >> > were
> > >> > > > >> added
> > >> > > > >>>>> in
> > >> > > > >>>>>> this year.
> > >> > > > >>>>>> What do you think?
> > >> > > > >>>>>>
> > >> > > > >>>>>> Lior
> > >> > > > >>
> > >> > >
> > >> > >
> > >> >
> > >>
> > >
> > >
>


Re: Enforce coding conventions at compilation time

2016-06-13 Thread Hari Shreedharan
I agree. Checkstyle is pretty useful - we should add it.

On Sat, Jun 11, 2016 at 7:50 AM Lior Zeno  wrote:

> Hi guys, we should make the reviewing process easier and more focused on
> correctness rather than style issues. I suggest enforcing our code style (
> https://cwiki.apache.org/confluence/display/FLUME/Code+Formatting) at
> compile time using the maven checkstyle plugin (
> https://maven.apache.org/plugins/maven-checkstyle-plugin/). This will make
> the reviewing process easier, and will make sure that all committed code is
> strictly following our code style.
> What do you think?
> Thanks
>


Re: Flume 1.7.0 release

2016-06-11 Thread Hari Shreedharan
Sounds good to me. If we have a volunteer to run the release I will gladly
help out.
On Sat, Jun 11, 2016 at 6:54 AM Lior Zeno  wrote:

> Anybody?
>
> On Thu, Jun 9, 2016 at 7:24 PM, Lior Zeno  wrote:
>
> > Hi guys,
> >
> > I think we should work together towards a new release. It has been a year
> > since the last release, and there are many new features that were added
> in
> > this year.
> > What do you think?
> >
> > Lior
> >
>


[jira] [Commented] (FLUME-2922) HDFSSequenceFile Should Sync Writer

2016-06-09 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323040#comment-15323040
 ] 

Hari Shreedharan commented on FLUME-2922:
-

I agree, we should flush the hdfs sequence file writer. 

As for hflush vs hsync: I don't know of any real-world HDFS-based system 
that uses hsync, because the hsync implementation is known to be unoptimized 
and hurts performance pretty drastically. hflush is what most systems use, as 
it flushes to datanode memory. There is data loss only if all 3 datanodes go 
down before the namenode detects under-replication and replicates the block, 
which is really unlikely. I believe the performance cost of hsync may not be 
worth it.
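
To illustrate the ordering only, here is a rough local-file analogy. It does
not use the HDFS client API: BufferedOutputStream.flush() stands in for
draining the writer's internal buffer into the underlying stream, and
FileDescriptor.sync() stands in for the more expensive "force it to disk"
step. The class name FlushThenSync is made up for this sketch:

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Local-file analogy only; this does not use the HDFS client API.
final class FlushThenSync {
  private FlushThenSync() {}

  static void write(Path file, String record, boolean forceToDisk) throws IOException {
    try (FileOutputStream raw = new FileOutputStream(file.toFile());
         BufferedOutputStream buffered = new BufferedOutputStream(raw)) {
      buffered.write(record.getBytes(StandardCharsets.UTF_8));
      // Step 1: drain the writer's own buffer into the underlying stream.
      // Without this, the record may still sit in application memory.
      buffered.flush();
      // Step 2 (optional, slower): ask the OS to persist the bytes to the
      // storage device before returning.
      if (forceToDisk) {
        raw.getFD().sync();
      }
    }
  }
}
```

The point of the analogy is that step 1 is needed in either case; step 2 is
where the durability versus throughput trade-off is made.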

> HDFSSequenceFile Should Sync Writer
> ---
>
> Key: FLUME-2922
> URL: https://issues.apache.org/jira/browse/FLUME-2922
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.6.0
> Reporter: Kevin Conaway
> Priority: Critical
>
> There is a possibility of losing data with the current HDFS sequence file 
> writer.
> Internally, the `SequenceFile.Writer` buffers data and periodically syncs it 
> to the underlying output stream.  The mechanism for doing this is dependent 
> on whether you are using compression or not but in both scenarios, the 
> key/values are appended to an internal buffer and only flushed to disk after 
> the buffer reaches a certain size.
> Thus it is quite possible for Flume to lose messages if the agent crashes, or 
> is stopped, before the internal buffer is flushed to disk.
> The correct action is to force the writer to sync its internal buffers to the 
> underlying `FSDataOutputStream` first before calling hflush/sync.
> Additionally, I believe we should be calling hsync instead of hflush.  It's 
> my understanding that writes with hsync should be more durable, which I 
> believe is the semantics we want here.





[jira] [Commented] (FLUME-2910) AsyncHBaseSink - Failure callbacks should log the exception that caused them

2016-05-19 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291682#comment-15291682
 ] 

Hari Shreedharan commented on FLUME-2910:
-

Does the Asynchbase API actually send the exception back as the argument of the 
call method? Is there javadoc available that specifies this?

> AsyncHBaseSink - Failure callbacks should log the exception that caused them
> 
>
> Key: FLUME-2910
> URL: https://issues.apache.org/jira/browse/FLUME-2910
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: v1.6.0
> Reporter: Abraham Fine
> Assignee: Abraham Fine
> Attachments: FLUME-2910.patch
>
>
> Failure callbacks in the AsyncHBaseSink currently do not log the exception 
> that causes them to be called. This should be fixed.





Re: How to contribute?

2016-05-12 Thread Hari Shreedharan
We don't really need a vote, I think. We just need someone to work on
getting merge scripts in place (we could copy it from one of the
existing projects, like Spark).

On Thu, May 12, 2016 at 7:06 AM, Jeff Holoman <jholo...@cloudera.com> wrote:
> Speaking of this, didn't we decide to move to PRs?
>
> Do we need a formal vote for this?
>
> Thanks
>
> Jeff
>
> On Wed, May 11, 2016 at 5:16 PM, Hari Shreedharan <hshreedha...@apache.org>
> wrote:
>
>> Hi Lior,
>>
>> Flume does not use PRs yet for reviews. Please attach the patch to the
>> jira and someone will get to it. Optionally, you can also submit the
>> patch on review board for review (though you must attach the patch on
>> the jira as well).
>>
>> See this as well:
>> https://cwiki.apache.org/confluence/display/FLUME/How+to+Contribute
>>
>> HTH.
>>
>>
>>
>> On Wed, May 11, 2016 at 8:48 AM, Lior Zeno <liorz...@gmail.com> wrote:
>> > Hi All,
>> >
>> > I would like to make code contributions to the project, however I'm not
>> > sure how.
>> > I have sent a PR (FLUME-2726) on github a few days ago, but never got a
>> > reply.
>> >
>> > Please let me know how to properly get started.
>> >
>> > Thanks
>>
>
>
>
> --
> Jeff Holoman


Re: How to contribute?

2016-05-11 Thread Hari Shreedharan
Hi Lior,

Flume does not use PRs yet for reviews. Please attach the patch to the
jira and someone will get to it. Optionally, you can also submit the
patch on review board for review (though you must attach the patch on
the jira as well).

See this as well:
https://cwiki.apache.org/confluence/display/FLUME/How+to+Contribute

HTH.



On Wed, May 11, 2016 at 8:48 AM, Lior Zeno  wrote:
> Hi All,
>
> I would like to make code contributions to the project, however I'm not
> sure how.
> I have sent a PR (FLUME-2726) on github a few days ago, but never got a
> reply.
>
> Please let me know how to properly get started.
>
> Thanks


Re: Review Request 47098: FLUME-2620 File channel throws NullPointerException if a header value is null

2016-05-09 Thread Hari Shreedharan


> On May 8, 2016, 9:16 p.m., Jarek Cecho wrote:
> > flume-ng-core/src/main/java/org/apache/flume/source/http/HTTPSourceHandler.java,
> >  line 44
> > 
> >
> > This will be a backward-incompatible change. Can we perhaps convert 
> > this interface to an abstract class and move handling of the 
> > nullReplacement here, so that we break the contract only once?
> > 
> > That way we won't have to break it again in the future when we need to 
> > add yet another argument.

Unfortunately, changing this to an abstract class does not solve the issue 
here, since that is a bytecode change and breaks compat (code too: implements 
vs extends). One way of fixing this is to add an abstract class that inherits 
this one, have an instanceof check in the Source itself, and call this method 
only if the handler is an instance of the abstract class.
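
Roughly, and with made-up names (LegacyHandler, NullAwareHandler, and
SourceDispatch are stand-ins for illustration, not the real Flume types or
signatures), the idea would be something like:

```java
import java.util.Map;

// All names below are hypothetical stand-ins, not the real Flume types.
interface LegacyHandler {
  Map<String, String> parse(Map<String, String> rawHeaders);
}

// The new capability lives in an abstract class that implements the old
// interface, so existing LegacyHandler implementations are left untouched.
abstract class NullAwareHandler implements LegacyHandler {
  public abstract Map<String, String> parse(
      Map<String, String> rawHeaders, String nullReplacement);

  @Override
  public Map<String, String> parse(Map<String, String> rawHeaders) {
    // Old entry point delegates with the default replacement (empty string).
    return parse(rawHeaders, "");
  }
}

final class SourceDispatch {
  private SourceDispatch() {}

  static Map<String, String> handle(
      LegacyHandler handler, Map<String, String> raw, String nullReplacement) {
    // Only handlers that opted in to the new contract get the extra argument.
    if (handler instanceof NullAwareHandler) {
      return ((NullAwareHandler) handler).parse(raw, nullReplacement);
    }
    return handler.parse(raw);
  }
}
```

Handlers written against the old interface keep working as before, and only
those extending the abstract class see the replacement string.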


- Hari


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47098/#review132183
---


On May 9, 2016, 4:57 a.m., neerja khattar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47098/
> ---
> 
> (Updated May 9, 2016, 4:57 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> The issue is when the header value is null it throws null pointer exception 
> and flume stops processing further events.
> For example:
> [{
>   "headers" : {
>  "timestamp" : "434324343",
>  "host" : null
>  },
>   "body" : "random_body"
>   }]
>   
>   The solution to fix this is:
>   
>   1. If the header has a null value in the json, flume will replace it with a 
> replacement string.
>   2. The default value for a replacement string is an empty string.
>   3. To overwrite default string, set "handler.nullReplacementHeader" 
> property in flume config.
> 
> 
> Diffs
> -
> 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/HTTPSource.java 
> b520b03 
>   
> flume-ng-core/src/main/java/org/apache/flume/source/http/HTTPSourceConfigurationConstants.java
>  86caf7d 
>   
> flume-ng-core/src/main/java/org/apache/flume/source/http/HTTPSourceHandler.java
>  726bf0c 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/JSONHandler.java 
> 197f66a 
>   
> flume-ng-core/src/test/java/org/apache/flume/source/http/TestJSONHandler.java 
> 455781c 
>   
> flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java
>  e84dec1 
> 
> Diff: https://reviews.apache.org/r/47098/diff/
> 
> 
> Testing
> ---
> 
> The following are the test cases:
> 
> 1. Header has null value in json and handler.nullReplacementHeader is not set 
> in flume config. The default value will be used to replace null.  
> 
> [{
>   "headers" : {
>  "timestamp" : "434324343",
>  "host" : null
>  },
>   "body" : "random_body"
>   }]
>   
>   Output in hdfs : {timestamp=434324343, host=} random_body  
>   
>   2. Header is not null in json and handler.nullReplacementHeader is not set 
> in flume config. The replacement implementation doesn't come into 
> consideration.
>
>   [{
>   "headers" : {
>  "timestamp" : "434324343",
>  "host" : 1
>  },
>   "body" : "random_body"
>   }]
>   Output in hdfs : {timestamp=434324343, host=1} random_body 
>   
>   3. Header has null value in json and handler.nullReplacementHeader=abc is 
> set in flume config. The null value in header will be replaced by abc.
> 
>   
>   [{
>   "headers" : {
>  "timestamp" : "434324343",
>  "host" : null
>  },
>   "body" : "random_body"
>   }]
>   
>  
>   Output in hdfs {timestamp=434324343, host=abc} random_body 
>   
>   4. Header has null value in json and handler.nullReplacementHeader=1 is set 
> in flume config. The null value in header will be replaced by 1 as a string.
>   
>   [{
>   "headers" : {
>  "timestamp" : "434324343",
>  "host" : null
>  },
>   "body" : "random_body"
>   }]
>   
>  
>   Output in hdfs: {timestamp=434324343, host=1} random_body
>   
>   5. Header is not null in json and handler.nullReplacementHeader is also set 
> in flume config. The replacement implementation doesn't come into 
> consideration.
>
>   [{
>   "headers" : {
>  "timestamp" : "434324343",
>  "host" : 1
>  },
>   "body" : "random_body"
>   }]
>   Output in hdfs : {timestamp=434324343, host=1} random_body
> 
> 
> File Attachments
> 
> 
> flume-2620
>   
> 
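
The null-header replacement described in the review request above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: the class and method names are hypothetical, and headers are modelled as a plain Map<String, String>.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullHeaderReplacementSketch {
    /**
     * Returns a copy of the headers in which every null value is replaced.
     * A null replacement falls back to the empty string, mirroring the
     * default described in the review request.
     */
    public static Map<String, String> replaceNullHeaders(Map<String, String> headers,
                                                         String nullReplacement) {
        String replacement = (nullReplacement == null) ? "" : nullReplacement;
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : headers.entrySet()) {
            String value = entry.getValue();
            result.put(entry.getKey(), value == null ? replacement : value);
        }
        return result;
    }
}
```

Non-null values pass through untouched, which corresponds to test cases 2 and 5 in the request.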

Release 1.7

2016-05-06 Thread Hari Shreedharan
Hi,

It has been almost a year since our 1.6 release. Over the year, we
have added a lot of new features, and fixes. I know most vendors have
already pulled these into their distros, but the official ASF release
is lagging way behind.

What do you think about a 1.7 release soon? Any volunteers to drive the release?


Thanks!
Hari


Re: Github integration

2016-04-05 Thread Hari Shreedharan
https://issues.apache.org/jira/servicedesk/agent/INFRA/issue/INFRA-11600

On Thu, Mar 24, 2016 at 10:12 AM, Hari Shreedharan
<hshreedha...@apache.org> wrote:
> No, github is not the primary scm. ASF git repos still are. But ASF
> does allow us to use github for reviews and such. Parquet, Spark etc
> have done it.
>
> On Thu, Mar 24, 2016 at 9:57 AM, Ralph Goers <ralph.go...@dslextreme.com> 
> wrote:
>> The ASF allows github to be the primary scm?  That is news to me.  My 
>> understanding is that projects have to use the ASFs git repo but you can 
>> pull changes from the github mirror.
>>
>> Ralph
>>
>>> On Mar 19, 2016, at 12:14 PM, Hari Shreedharan <hshreedha...@apache.org> 
>>> wrote:
>>>
>>> Hi,
>>>
>>> I have worked for a while on Spark recently, and like using github for
>>> scm. While not the best tool for code reviews, it certainly is better
>>> than using patches on jiras. We already get a lot of review requests
>>> as Pull Requests anyway. I'd like some community feedback on this.
>>>
>>> I think we'd need a vote before we can get it done.
>>>
>>> Thanks!
>>> Hari
>>>
>>
>>


Re: Github integration

2016-03-24 Thread Hari Shreedharan
No, github is not the primary scm. ASF git repos still are. But ASF
does allow us to use github for reviews and such. Parquet, Spark etc
have done it.

On Thu, Mar 24, 2016 at 9:57 AM, Ralph Goers <ralph.go...@dslextreme.com> wrote:
> The ASF allows github to be the primary scm?  That is news to me.  My 
> understanding is that projects have to use the ASFs git repo but you can pull 
> changes from the github mirror.
>
> Ralph
>
>> On Mar 19, 2016, at 12:14 PM, Hari Shreedharan <hshreedha...@apache.org> 
>> wrote:
>>
>> Hi,
>>
>> I have worked for a while on Spark recently, and like using github for
>> scm. While not the best tool for code reviews, it certainly is better
>> than using patches on jiras. We already get a lot of review requests
>> as Pull Requests anyway. I'd like some community feedback on this.
>>
>> I think we'd need a vote before we can get it done.
>>
>> Thanks!
>> Hari
>>
>
>


[jira] [Commented] (FLUME-2823) Flume-Kafka-Channel with new APIs

2016-03-20 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203491#comment-15203491
 ] 

Hari Shreedharan commented on FLUME-2823:
-

This looks good, but the tests are hanging. There is at least 1 incomplete test 
as well. Could you please fix that as well?

> Flume-Kafka-Channel with new APIs
> -
>
> Key: FLUME-2823
> URL: https://issues.apache.org/jira/browse/FLUME-2823
> Project: Flume
>  Issue Type: Sub-task
>Reporter: Jeff Holoman
>Assignee: Jeff Holoman
> Attachments: FLUME-2823v4.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Github integration

2016-03-19 Thread Hari Shreedharan
Hi,

I have worked for a while on Spark recently, and like using github for
scm. While not the best tool for code reviews, it certainly is better
than using patches on jiras. We already get a lot of review requests
as Pull Requests anyway. I'd like some community feedback on this.

I think we'd need a vote before we can get it done.

Thanks!
Hari


[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-03-08 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186255#comment-15186255
 ] 

Hari Shreedharan commented on FLUME-2889:
-

[~tmgstev] - sounds good. Like [~roshan_naik] said, go ahead and reuse this 
jira.

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.





[jira] [Updated] (FLUME-2891) Revert FLUME-2712 and FLUME-2886

2016-03-08 Thread Hari Shreedharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Shreedharan updated FLUME-2891:

Attachment: FLUME-2891.patch

> Revert FLUME-2712 and FLUME-2886
> 
>
> Key: FLUME-2891
> URL: https://issues.apache.org/jira/browse/FLUME-2891
> Project: Flume
>  Issue Type: Bug
>    Reporter: Hari Shreedharan
>    Assignee: Hari Shreedharan
> Attachments: FLUME-2891.patch
>
>
> FLUME-2712 can be fixed by simply setting keep-alive to 0. I think it added 
> additional complexity, which we can probably avoid. 





[jira] [Created] (FLUME-2891) Revert FLUME-2712 and FLUME-2886

2016-03-08 Thread Hari Shreedharan (JIRA)
Hari Shreedharan created FLUME-2891:
---

 Summary: Revert FLUME-2712 and FLUME-2886
 Key: FLUME-2891
 URL: https://issues.apache.org/jira/browse/FLUME-2891
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan


FLUME-2712 can be fixed by simply setting keep-alive to 0. I think it added 
additional complexity, which we can probably avoid. 





[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-03-08 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185798#comment-15185798
 ] 

Hari Shreedharan commented on FLUME-2889:
-

[~roshan_naik] - Mind committing this?

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.





[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-02-26 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170042#comment-15170042
 ] 

Hari Shreedharan commented on FLUME-2889:
-

Sounds good, [~roshan_naik] - let's revert the last one and commit this one. 
This one LGTM, so +1.

I'd still wait for [~tmgstev] to take a look, since he caught the issue 
earlier. 

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.





[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-02-26 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169986#comment-15169986
 ] 

Hari Shreedharan commented on FLUME-2889:
-

[~tmgstev] Thanks for the patch. Can you base the patch on current trunk, 
rather than before the last commit, so I can just directly commit this?

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.





[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-02-26 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169511#comment-15169511
 ] 

Hari Shreedharan commented on FLUME-2889:
-

Ah, I get your concern. Since we never set the year in the date 
object, it should be fixed.minusYears and fixed.plusYears...

[~roshan_naik] - I think that does make sense. I think we need to fix this, 
correct?

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.





[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-02-25 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168114#comment-15168114
 ] 

Hari Shreedharan commented on FLUME-2889:
-

+1. LGTM. 

[~roshan_naik] Please go ahead and commit it, else I will commit it 
today/tomorrow

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Attachments: FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.





Re: Leap year : Date parsing in SyslogParser used by MultiportSyslogTCP source

2016-02-25 Thread Hari Shreedharan
I am looking at your patch now.

On Thu, Feb 25, 2016 at 1:15 PM, Roshan Naik  wrote:
> Recently found a DateTime computation issue (FLUME-2889) which might affect 
> dates when a leap year is involved.
>
> Given that 2016 is a leap year, I am trying to assess the impact of this bug.
>
> Method SyslogParser.parseRfc3164Time() appears to be actually adjusting the 
> year in the date that it is parsing. It adds 1 year to, or subtracts 1 year 
> from, the parsed date based on the current system datetime.
>
> Questions:
> 1 - Why is it trying to modify the year on the parsed date instead of just 
> using it as is?
> 2 - On a flume agent that retains this bug, intuitively it seems like this 
> will likely cause incorrect dates in the data, leading to messed up data. 
> Would that be right?
>
> -roshan
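
For context on question 1: RFC 3164 timestamps carry no year, so a parser has to infer one from the local clock. The sketch below shows one way such an inference could look. It uses java.time rather than the Joda-Time API discussed in FLUME-2889, and the class name, method name and the one-month / eleven-month thresholds are illustrative assumptions, not the SyslogParser implementation.

```java
import java.time.LocalDateTime;

public class Rfc3164YearSketch {
    /**
     * parsed: month/day/time taken from the message, held in an arbitrary
     *         leap base year (e.g. 2000) so that Feb 29 is representable.
     * now:    the receiver's current clock.
     */
    public static LocalDateTime inferYear(LocalDateTime parsed, LocalDateTime now) {
        int year = now.getYear();
        LocalDateTime candidate = parsed.withYear(year);
        if (candidate.isAfter(now.plusMonths(1))) {
            // Looks far in the future: the message was probably sent before the year rolled.
            // Derive from the parsed value again so Feb 29 is clamped at most once.
            return parsed.withYear(year - 1);
        }
        if (candidate.isBefore(now.minusMonths(11))) {
            // Looks almost a year old: the sender's clock has probably already rolled over.
            return parsed.withYear(year + 1);
        }
        return candidate;
    }
}
```

In java.time, withYear moves Feb 29 to Feb 28 when the target year is not a leap year, so each result here is derived from the original parsed value instead of chaining adjustments on an already adjusted date.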


[jira] [Updated] (FLUME-2886) Optional Channels can cause OOMs

2016-02-22 Thread Hari Shreedharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Shreedharan updated FLUME-2886:

Attachment: FLUME-2886.patch

> Optional Channels can cause OOMs
> 
>
> Key: FLUME-2886
> URL: https://issues.apache.org/jira/browse/FLUME-2886
> Project: Flume
>  Issue Type: Bug
>    Reporter: Hari Shreedharan
>    Assignee: Hari Shreedharan
> Attachments: FLUME-2886.patch
>
>
> If an optional channel is full, the queue backing the executor that is 
> asynchronously submitting the events to the channel can grow indefinitely in 
> size leading to a huge number of events on the heap and causing OOMs.





[jira] [Updated] (FLUME-2886) Optional Channels can cause OOMs

2016-02-22 Thread Hari Shreedharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Shreedharan updated FLUME-2886:

Attachment: (was: FLUME-2886.patch)

> Optional Channels can cause OOMs
> 
>
> Key: FLUME-2886
> URL: https://issues.apache.org/jira/browse/FLUME-2886
> Project: Flume
>  Issue Type: Bug
>    Reporter: Hari Shreedharan
>    Assignee: Hari Shreedharan
> Attachments: FLUME-2886.patch
>
>
> If an optional channel is full, the queue backing the executor that is 
> asynchronously submitting the events to the channel can grow indefinitely in 
> size leading to a huge number of events on the heap and causing OOMs.





[jira] [Updated] (FLUME-2886) Optional Channels can cause OOMs

2016-02-22 Thread Hari Shreedharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Shreedharan updated FLUME-2886:

Attachment: FLUME-2886.patch

This fixes the issue and adds a unit test

> Optional Channels can cause OOMs
> 
>
> Key: FLUME-2886
> URL: https://issues.apache.org/jira/browse/FLUME-2886
> Project: Flume
>  Issue Type: Bug
>    Reporter: Hari Shreedharan
>    Assignee: Hari Shreedharan
> Attachments: FLUME-2886.patch
>
>
> If an optional channel is full, the queue backing the executor that is 
> asynchronously submitting the events to the channel can grow indefinitely in 
> size leading to a huge number of events on the heap and causing OOMs.





[jira] [Created] (FLUME-2886) Optional Channels can cause OOMs

2016-02-22 Thread Hari Shreedharan (JIRA)
Hari Shreedharan created FLUME-2886:
---

 Summary: Optional Channels can cause OOMs
 Key: FLUME-2886
 URL: https://issues.apache.org/jira/browse/FLUME-2886
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan


If an optional channel is full, the queue backing the executor that is 
asynchronously submitting the events to the channel can grow indefinitely in 
size leading to a huge number of events on the heap and causing OOMs.
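
To illustrate the failure mode and one common remedy (not necessarily what the attached patch does): an executor backed by an unbounded queue keeps accepting tasks while the channel is full, whereas a bounded queue plus a rejection handler caps the number of events held on the heap. The class name, the capacity values and the drop-and-count policy below are assumptions made for this sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class BoundedOptionalExecutorSketch {
    /**
     * Builds a fixed-size pool whose work queue holds at most queueCapacity tasks.
     * Tasks submitted beyond that are dropped and counted instead of piling up.
     */
    public static ThreadPoolExecutor newBoundedExecutor(int threads, int queueCapacity,
                                                        AtomicLong droppedCounter) {
        RejectedExecutionHandler dropAndCount =
            (task, executor) -> droppedCounter.incrementAndGet();
        return new ThreadPoolExecutor(
            threads, threads,
            0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<Runnable>(queueCapacity),
            dropAndCount);
    }
}
```

Dropping is only defensible here because the channel is optional; a required channel would need back-pressure or an error instead.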





Re: Reviews

2016-02-08 Thread Hari Shreedharan
Hi Ralph,

Sorry about the delay - I have been out for a bit and will look at a bunch of
pending reviews this week. I am looking at yours right now.

On Mon, Feb 8, 2016 at 8:00 AM Ralph Goers 
wrote:

> I submitted a review for Flume 2875 a week ago. I have updated it a few
> times since then, the last being on Feb 5. No  one has apparently looked at
> the Review.  As you might know, I am not a fan of RTC for the exact reason
> that it slows everything down with no assurance of any value being added.
>
> The policy on the wiki says nothing about review requests that are
> ignored.  I would propose that if a review gets no feedback within 72 hours
> then the committer is free to commit their change.  FWIW, I plan to do
> exactly that tonight or tomorrow as time permits.
>
> Ralph
>
-- 
Thanks,
Hari


[jira] [Commented] (FLUME-2875) Allow RollingFileSink to specify a file prefix and a file extension.

2016-02-08 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138160#comment-15138160
 ] 

Hari Shreedharan commented on FLUME-2875:
-

+1. Running tests and committing

> Allow RollingFileSink to specify a file prefix and a file extension.
> 
>
> Key: FLUME-2875
> URL: https://issues.apache.org/jira/browse/FLUME-2875
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Ralph Goers
>Assignee: Ralph Goers
> Fix For: v1.7.0
>
> Attachments: FLUME-2875.patch
>
>
> Currently the RollingFileSink is hard-wired to use a specific PathManager 
> that creates files based on a timestamp and an incrementing value. Users 
> should have the ability to add a prefix to the file name and to add a file 
> extension to properly identify the type of data in the file.  In addition, 
> users should have the ability to provide their own PathManager 
> implementation to allow them to construct the file names and locations 
> however they desire.





[jira] [Deleted] (FLUME-2873) The issue should be deleted. I do not know how to delete it.

2016-02-02 Thread Hari Shreedharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Shreedharan deleted FLUME-2873:



> The issue should be deleted. I do not know how to delete it.
> 
>
> Key: FLUME-2873
> URL: https://issues.apache.org/jira/browse/FLUME-2873
> Project: Flume
>  Issue Type: New Feature
>Reporter: Li Ye
>






Re: Flume build hangs and fails in Kafka channel

2016-02-01 Thread Hari Shreedharan
I am not sure - I have not really hit this before. I think you could
just @Ignore the test for now - I will take a look at it this week
sometime and see what the issue is.

On Mon, Feb 1, 2016 at 11:35 AM, Ralph Goers <ralph.go...@dslextreme.com> wrote:
> Any suggestions on how to debug it?  I am adding some new components for my 
> own use that I will probably give back but I want to make sure I am not 
> breaking anything, so I really need a clean build before I get started.
>
> Ralph
>
>> On Feb 1, 2016, at 12:17 PM, Hari Shreedharan <hshreedha...@apache.org> 
>> wrote:
>>
>> Looks like the test is just waiting for something to happen - looks
>> like the test is not working as expected. I don't see this, but this
>> indicates the test is flakey.
>>
>> On Mon, Feb 1, 2016 at 8:22 AM, Ralph Goers <ralph.go...@dslextreme.com> 
>> wrote:
>>> I am trying to run the Flume build on my MacBook Pro and it is hanging 
>>> running the tests for the Kafka file channel. A portion of the stack trace 
>>> is below. After 15 minutes of waiting the build just terminates with no 
>>> errors - it just emits
>>>
>>> [ERROR] Failed to execute goal 
>>> org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) 
>>> on project flume-kafka-channel: ExecutionException; nested exception is 
>>> java.util.concurrent.ExecutionException: java.lang.RuntimeException: The 
>>> forked VM terminated without saying properly goodbye. VM crash or 
>>> System.exit called ? -> [Help 1]
>>>
>>> Any idea why this might be happening?
>>>
>>> Ralph
>>>
>>>   java.lang.Thread.State: WAITING (parking)
>>>at sun.misc.Unsafe.park(Native Method)
>>>- parking to wait for  <0x0007004286a8> (a 
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>>>at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>>>at 
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>>>at 
>>> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>>>at 
>>> java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:193)
>>>at 
>>> org.apache.flume.channel.kafka.TestKafkaChannel.wait(TestKafkaChannel.java:413)
>>>at 
>>> org.apache.flume.channel.kafka.TestKafkaChannel.writeAndVerify(TestKafkaChannel.java:291)
>>>at 
>>> org.apache.flume.channel.kafka.TestKafkaChannel.doTestSuccessRollback(TestKafkaChannel.java:104)
>>>at 
>>> org.apache.flume.channel.kafka.TestKafkaChannel.testSuccessInterleave(TestKafkaChannel.java:88)
>>
>
>


Re: Flume build hangs and fails in Kafka channel

2016-02-01 Thread Hari Shreedharan
Looks like the test is just waiting for something to happen - looks
like the test is not working as expected. I don't see this, but this
indicates the test is flakey.

On Mon, Feb 1, 2016 at 8:22 AM, Ralph Goers  wrote:
> I am trying to run the Flume build on my MacBook Pro and it is hanging 
> running the tests for the Kafka file channel. A portion of the stack trace is 
> below. After 15 minutes of waiting the build just terminates with no errors - 
> it just emits
>
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on 
> project flume-kafka-channel: ExecutionException; nested exception is 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: The 
> forked VM terminated without saying properly goodbye. VM crash or System.exit 
> called ? -> [Help 1]
>
> Any idea why this might be happening?
>
> Ralph
>
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0007004286a8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:193)
> at 
> org.apache.flume.channel.kafka.TestKafkaChannel.wait(TestKafkaChannel.java:413)
> at 
> org.apache.flume.channel.kafka.TestKafkaChannel.writeAndVerify(TestKafkaChannel.java:291)
> at 
> org.apache.flume.channel.kafka.TestKafkaChannel.doTestSuccessRollback(TestKafkaChannel.java:104)
> at 
> org.apache.flume.channel.kafka.TestKafkaChannel.testSuccessInterleave(TestKafkaChannel.java:88)


Re: FLUME-2719

2015-12-08 Thread Hari Shreedharan
I am getting DNS errors while trying to get to ASF jira right now. I will
take a look later.


Thanks,
Hari

On Tue, Dec 8, 2015 at 7:24 PM, Gonzalo Herreros 
wrote:

> Hi,
>
> Could you have a look at this issue
> https://issues.apache.org/jira/browse/FLUME-2719 and provide some
> feedback/review on the patch?
>
> Thanks,
> Gonzalo
>


[jira] [Commented] (FLUME-2850) FileChannel should allow take operation when minimumRequiredSpace runs out

2015-12-07 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046213#comment-15046213
 ] 

Hari Shreedharan commented on FLUME-2850:
-

In theory, I agree. But remember that takes also do take up disk space. It 
would be interesting to add an additional config, but the implementation is 
likely to be non-trivial.

> FileChannel should allow take operation when minimumRequiredSpace runs out
> --
>
> Key: FLUME-2850
> URL: https://issues.apache.org/jira/browse/FLUME-2850
> Project: Flume
>  Issue Type: Bug
>  Components: File Channel
>Affects Versions: v1.6.0
>Reporter: Tycho Lamerigts
>
> In the status quo, when minimumRequiredSpace runs out, FileChannel closes 
> itself and thereby prevents flume from ever self-recovering. Instead, manual 
> action is needed to free up some disk space.  If FileChannel would only block 
> put operations and would still allow take operations then flume could 
> self-recover, assuming some sinks will eventually succeed in draining the 
> channel. Also, it would lead to fewer dropped events overall.





[jira] [Created] (FLUME-2841) Upgrade commons-collections to 3.2.2

2015-11-18 Thread Hari Shreedharan (JIRA)
Hari Shreedharan created FLUME-2841:
---

 Summary: Upgrade commons-collections to 3.2.2
 Key: FLUME-2841
 URL: https://issues.apache.org/jira/browse/FLUME-2841
 Project: Flume
  Issue Type: Bug
  Components: Build
Affects Versions: v1.6.0
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan


Refer: COLLECTIONS-580





[jira] [Updated] (FLUME-2841) Upgrade commons-collections to 3.2.2

2015-11-18 Thread Hari Shreedharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Shreedharan updated FLUME-2841:

Attachment: FLUME-2841.patch

Trivial patch

> Upgrade commons-collections to 3.2.2
> 
>
> Key: FLUME-2841
> URL: https://issues.apache.org/jira/browse/FLUME-2841
> Project: Flume
>  Issue Type: Bug
>  Components: Build
>Affects Versions: v1.6.0
>Reporter: Hari Shreedharan
>    Assignee: Hari Shreedharan
> Attachments: FLUME-2841.patch
>
>
> Refer: COLLECTIONS-580





Re: Potential Gerrit support in review/commit flow

2015-11-06 Thread Hari Shreedharan
+1. I had looked into this a while back, and could not make much progress
as ASF I believe required us to host the gerrit instance, but was not ok
with giving us control over commit access (gerrit user needs to be able to
commit to the repo or something). Anyway, if you can do it, I will be
interested!


Thanks,
Hari

On Fri, Nov 6, 2015 at 10:56 AM, Jarek Jarcec Cecho 
wrote:

> I like the gerrit flow a lot. I can’t speak for the flume community, but I
> would be in favor of that :)
>
> You might consider sending similar email to Sqoop community where I would
> support gerrit as well.
>
> Jarcec
>
> > On Nov 6, 2015, at 10:27 AM, Zhe Zhang  wrote:
> >
> > Hi Flume contributors,
> >
> > The Hadoop community is considering adding Gerrit as a review / commit
> > tool. Since this will require support from the Apache Infra team, it
> makes
> > more sense if multiple projects can benefit from the effort.
> >
> > The main benefit of Gerrit over ReviewBoard is better integration with
> git
> > and Jenkins. A Gerrit review request is created through a simple "git
> push"
> > instead of manually creating and uploading a diff file. Conflicts
> detection
> > and rebase can be done on the review UI as well. When the programmed
> commit
> > criteria are met (e.g. a code review +1 and a Jenkins verification),
> > committing can also be done with a single button click, or even
> > automatically.
> >
> > The main benefit of Gerrit over Github pull requests is the rebase
> workflow
> > (rather than git merge), which avoids merge commits. The rebase workflow
> > also enables a clear interdiff view, rather than reviewing every patch
> rev
> > from scratch.
> >
> > This also just augments instead of replacing the current review / commit
> > flow. Every task will still start as a JIRA. Review comments can be made
> on
> > both JIRA and Gerrit and will be bi-directionally mirrored. Patches can
> > also be directly committed through git command line (Gerrit will
> recognize
> > a direct commit and close the review request as long as a simple git hook
> > is installed:
> > https://gerrit.googlecode.com/svn/documentation/2.0/user-changeid.html).
> >
> > I wonder if the Flume community would be interested in moving on this
> > direction as well. Any feedback is much appreciated.
> >
> > Thanks,
> > Zhe Zhang
>
>


[jira] [Commented] (FLUME-2712) Optional channel errors slows down the Source to Main channel event rate

2015-10-30 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983371#comment-14983371
 ] 

Hari Shreedharan commented on FLUME-2712:
-

+1. LGTM. Running tests and committing

> Optional channel errors slows down the Source to Main channel event rate
> 
>
> Key: FLUME-2712
> URL: https://issues.apache.org/jira/browse/FLUME-2712
> Project: Flume
>  Issue Type: Bug
>Reporter: Johny Rufus
>Assignee: Johny Rufus
> Attachments: FLUME-2712-1.patch, FLUME-2712-2.patch, 
> FLUME-2712.patch, FLUME-2712.patch
>
>
> When we have a source configured to deliver events to a main channel and an 
> optional channel, and if the delivery to optional channel fails, this 
> significantly slows down the rate at which events are delivered to the main 
> channel by the source.
> We need to evaluate async means of delivering events to the optional channel 
> and isolate the errors happening in optional channel from the delivery to the 
> main channel





[jira] [Commented] (FLUME-2712) Optional channel errors slows down the Source to Main channel event rate

2015-10-29 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981879#comment-14981879
 ] 

Hari Shreedharan commented on FLUME-2712:
-

Johny,

This looks good in general. Some minor comments:
* I think {{executeRequiredChannelTransaction}} can be renamed 
{{executeChannelTransaction}} and used for both cases. Just take an additional 
flag as an argument and throw the {{ChannelException}} based on that. Otherwise 
it and the run method in {{OptionalChannelTransactionExecutor}} are exactly the same
* {{OptionalChannelTransactionExecutor}} should be named 
{{OptionalChannelTransactionThread}} or something like that - it really is not 
an executor.
* {{List events = new ArrayList()}} -> {{List events = new 
ArrayList(1)}} in {{processEvent}} method.
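
The first suggestion above could look roughly like the following. This is only a sketch: the `Channel` interface and `ChannelException` class here are simplified stand-ins, not the real Flume types, and the `required` flag name is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class ChannelTransactionSketch {
    /** Simplified stand-in for Flume's ChannelException (hypothetical). */
    static class ChannelException extends RuntimeException {
        ChannelException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical minimal channel: accepts a batch or throws. */
    interface Channel {
        void putBatch(List<String> events);
    }

    /**
     * One method for both kinds of channel. A failure on a required channel
     * is rethrown as ChannelException; a failure on an optional channel is
     * swallowed and reported through the return value.
     */
    static boolean executeChannelTransaction(Channel channel, List<String> events,
                                             boolean required) {
        try {
            channel.putBatch(events);
            return true;
        } catch (RuntimeException e) {
            if (required) {
                throw new ChannelException("Unable to put batch on required channel", e);
            }
            return false; // optional channel: real code would log and move on
        }
    }

    public static void main(String[] args) {
        List<String> batch = new ArrayList<>(1); // sized for a single event
        batch.add("event-1");
        Channel failing = b -> { throw new IllegalStateException("channel full"); };

        System.out.println(executeChannelTransaction(failing, batch, false));
        try {
            executeChannelTransaction(failing, batch, true);
        } catch (ChannelException e) {
            System.out.println("required channel failed: " + e.getCause().getMessage());
        }
    }
}
```

With a single method, the required and optional paths differ only in how the failure is reported, which is the duplication the review comment points at.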

> Optional channel errors slows down the Source to Main channel event rate
> 
>
> Key: FLUME-2712
> URL: https://issues.apache.org/jira/browse/FLUME-2712
> Project: Flume
>  Issue Type: Bug
>Reporter: Johny Rufus
>Assignee: Johny Rufus
> Attachments: FLUME-2712-1.patch, FLUME-2712.patch, FLUME-2712.patch
>
>
> When we have a source configured to deliver events to a main channel and an 
> optional channel, and if the delivery to optional channel fails, this 
> significantly slows down the rate at which events are delivered to the main 
> channel by the source.
> We need to evaluate async means of delivering events to the optional channel 
> and isolate the errors happening in optional channel from the delivery to the 
> main channel





[jira] [Commented] (FLUME-944) Implement a Load-balancing channel selector for distributing the load over many channels.

2015-10-21 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968254#comment-14968254
 ] 

Hari Shreedharan commented on FLUME-944:


Sure - if it is something that interests you, there are probably others who'd 
like to see this as well. Please feel free to upload a patch.

> Implement a Load-balancing channel selector for distributing the load over 
> many channels.
> -
>
> Key: FLUME-944
> URL: https://issues.apache.org/jira/browse/FLUME-944
> Project: Flume
>  Issue Type: Improvement
>Reporter: Arvind Prabhakar
> Attachments: FLUME-944-1.patch
>
>
> The load balancing channel selector could distribute load via:
> * round-robin semantics, or
> * using dynamic load measurements from the channel





[jira] [Commented] (FLUME-2812) Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.Error: Maximum permit count exceeded

2015-10-14 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957697#comment-14957697
 ] 

Hari Shreedharan commented on FLUME-2812:
-

What does your configuration look like?

> Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" 
> java.lang.Error: Maximum permit count exceeded
> --
>
> Key: FLUME-2812
> URL: https://issues.apache.org/jira/browse/FLUME-2812
> Project: Flume
>  Issue Type: Bug
>  Components: Channel, Sinks+Sources
>Affects Versions: v1.6.0
> Environment: **OS INFO**
> CentOS release 6.6 (Final)
> Kernel \r on an \m
> **JAVA INFO**
> java version "1.8.0_40"
> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>Reporter: Rollin Crittendon
>Priority: Critical
>
> We are finding that after about an hour of heavy processing of Flume 
> data in an agent we get the following exception.  This is after 
> processing about 5-7 k lines/second during that time.
> The configuration of this agent is using a Kafka source, the one that comes 
> with 1.6.0. 
> It is also using a Memory channel, and a Thrift sink.
> ===
> Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" 
> java.lang.Error: Maximum permit count exceeded
>   at 
> java.util.concurrent.Semaphore$Sync.tryReleaseShared(Semaphore.java:192)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341)
>   at java.util.concurrent.Semaphore.release(Semaphore.java:609)
>   at 
> org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:147)
>   at 
> org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
>   at 
> org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:379)
>   at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>   at java.lang.Thread.run(Thread.java:745)
> ===
> The above error is from standard error when running the Flume agent.  The 
> effect is that the "SinkRunner-PollingRunner-DefaultSinkProcessor" thread 
> disappears from the agent, this can be seen on a JMX console.
> For us, this means that the Flume agent needs to get restarted.  It is an 
> error that is terminal in that instance of the Java process due to the thread 
> disappearing as a result.
> It sounds like something in JDK 7+ got stricter?!
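
For reference, the error message in the trace comes from `java.util.concurrent.Semaphore` itself: releasing a permit when the count is already at `Integer.MAX_VALUE` overflows the counter and the JDK throws `java.lang.Error`. The sketch below only reproduces that JDK behavior in isolation. It does not show what in the agent caused the permit count to grow, and the class and method names are made up for illustration.

```java
import java.util.concurrent.Semaphore;

final class SemaphoreOverflowDemo {
    /** Releases one permit on a semaphore that already holds the maximum count. */
    static String releaseOnFullSemaphore() {
        Semaphore permits = new Semaphore(Integer.MAX_VALUE);
        try {
            permits.release(); // count would overflow past Integer.MAX_VALUE
            return "no error";
        } catch (Error e) {
            // The JDK signals the overflow with java.lang.Error, not an exception,
            // which is why the sink runner thread in the report died.
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(releaseOnFullSemaphore());
    }
}
```

Because this is an `Error` rather than an `Exception`, a typical `catch (Exception e)` block will not intercept it, which is consistent with the thread disappearing as described in the report.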





[jira] [Commented] (FLUME-2712) Optional channel errors slows down the Source to Main channel event rate

2015-10-08 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949211#comment-14949211
 ] 

Hari Shreedharan commented on FLUME-2712:
-

In general, the idea looks ok. But there is one problem with this patch - the 
ordering of events in the optional channel is no longer guaranteed, since we 
are using a thread pool to deliver them. This is true even if there is exactly 
one source. This is a pretty obvious regression. I think we'd need to ensure 
ordering by having a single thread submit the events - we should keep a 
blocking queue and have the thread poll that queue.
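
A minimal sketch of the single-thread approach described above, assuming events are plain strings and that "delivery" just appends to a list. `OrderedOptionalDelivery` and its methods are hypothetical names, not part of the patch under review.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class OrderedOptionalDelivery {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    // Written only by the worker thread; read by callers after join().
    private final List<String> delivered = new ArrayList<>();
    private volatile boolean running = true;
    private final Thread worker;

    OrderedOptionalDelivery() {
        worker = new Thread(() -> {
            try {
                // Keep draining until asked to stop AND the queue is empty.
                while (running || !queue.isEmpty()) {
                    String event = queue.poll(50, TimeUnit.MILLISECONDS);
                    if (event != null) {
                        delivered.add(event); // stand-in for the optional channel put
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "optional-channel-worker");
        worker.start();
    }

    /** Called by the source thread; never blocks on the optional channel. */
    void submit(String event) {
        queue.add(event);
    }

    /** Stops the worker once the queue is drained and returns what it delivered. */
    List<String> stopAndGetDelivered() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return delivered;
    }

    public static void main(String[] args) {
        OrderedOptionalDelivery delivery = new OrderedOptionalDelivery();
        for (int i = 0; i < 5; i++) {
            delivery.submit("e" + i);
        }
        System.out.println(delivery.stopAndGetDelivered());
    }
}
```

Because exactly one thread takes from the FIFO queue, events reach the optional channel in submission order, while the source thread only pays for an enqueue.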

> Optional channel errors slows down the Source to Main channel event rate
> 
>
> Key: FLUME-2712
> URL: https://issues.apache.org/jira/browse/FLUME-2712
> Project: Flume
>  Issue Type: Bug
>Reporter: Johny Rufus
>Assignee: Johny Rufus
> Attachments: FLUME-2712.patch, FLUME-2712.patch
>
>
> When we have a source configured to deliver events to a main channel and an 
> optional channel, and if the delivery to optional channel fails, this 
> significantly slows down the rate at which events are delivered to the main 
> channel by the source.
> We need to evaluate async means of delivering events to the optional channel 
> and isolate the errors happening in optional channel from the delivery to the 
> main channel





[jira] [Commented] (FLUME-2712) Optional channel errors slows down the Source to Main channel event rate

2015-10-08 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949215#comment-14949215
 ] 

Hari Shreedharan commented on FLUME-2712:
-

Also please add a test to ensure no data gets dropped

> Optional channel errors slows down the Source to Main channel event rate
> 
>
> Key: FLUME-2712
> URL: https://issues.apache.org/jira/browse/FLUME-2712
> Project: Flume
>  Issue Type: Bug
>Reporter: Johny Rufus
>Assignee: Johny Rufus
> Attachments: FLUME-2712.patch, FLUME-2712.patch
>
>
> When we have a source configured to deliver events to a main channel and an 
> optional channel, and if the delivery to optional channel fails, this 
> significantly slows down the rate at which events are delivered to the main 
> channel by the source.
> We need to evaluate async means of delivering events to the optional channel 
> and isolate the errors happening in optional channel from the delivery to the 
> main channel





[jira] [Commented] (FLUME-2781) A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used by a Flume source

2015-10-08 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949671#comment-14949671
 ] 

Hari Shreedharan commented on FLUME-2781:
-

+1. LGTM. I am running the tests now. Will commit once the tests pass.

> A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used 
> by a Flume source
> -
>
> Key: FLUME-2781
> URL: https://issues.apache.org/jira/browse/FLUME-2781
> Project: Flume
>  Issue Type: Improvement
>Affects Versions: v1.6.0
>Reporter: Gonzalo Herreros
>  Labels: easyfix, patch
> Attachments: FLUME-2781.patch
>
>
> When a Kafka channel is configured as parseAsFlumeEvent=false, the channel 
> will read events from the topic as text instead of serialized Avro Flume 
> events.
> This is useful so Flume can read from an existing Kafka topic, where other 
> Kafka clients publish as text.
> However, if you use a Flume source on that channel, it will still write the 
> events as Avro so it will create an inconsistency and those events will fail 
> to be read correctly.
> Also, this would allow a Flume source to write to a Kafka channel and any 
> Kafka subscriber to listen to Flume events passing through without binary 
> dependencies.





[jira] [Commented] (FLUME-2437) S3 Source

2015-10-01 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940710#comment-14940710
 ] 

Hari Shreedharan commented on FLUME-2437:
-

Nope. You reviewed it, if you think it is good to go, go ahead and push it.

> S3 Source
> -
>
> Key: FLUME-2437
> URL: https://issues.apache.org/jira/browse/FLUME-2437
> Project: Flume
>  Issue Type: New Feature
>Reporter: Jonathan Natkins
>Assignee: Johny Rufus
> Attachments: FLUME-2437-2.patch, FLUME-2437.patch
>
>
> There have been multiple requests on the mailing list for an S3 source





[jira] [Commented] (FLUME-2777) Tail Dir Source leads to duplicate events on rolling the tailed file

2015-09-24 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906781#comment-14906781
 ] 

Hari Shreedharan commented on FLUME-2777:
-

Running tests and committing.

> Tail Dir Source leads to duplicate events on rolling the tailed file
> 
>
> Key: FLUME-2777
> URL: https://issues.apache.org/jira/browse/FLUME-2777
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: notrack
>Reporter: Johny Rufus
>Assignee: Johny Rufus
> Attachments: FLUME-2777-1.patch, FLUME-2777.patch
>
>
> I have a simple setup, where I write 200 events to logfile1. [TailSrc is on 
> the lookout for logfile* ]
> Then I rename logfile1 to logfile2.
> I create a new logfile1 and write 100 events to it.
> Typically I should see 300 events in my channel. But I see 500 events.
> I was able to trace the duplicates to ReliableTaildirEventReader.java 
> updateFiles(boolean), which handles renamed files by specifying a 
> starting position of 0. [This starting position should be obtained from 
> tf.getPosition()]
> I am attaching a proposed fix, would be great if one of you guys 
> [~iijima_satoshi] / [~hshreedharan]/ [~roshan_naik] can take a look at the 
> fix and validate the issue.
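
The arithmetic in the report (500 events instead of 300) can be reproduced with a small model. This is a simplified sketch, not the Taildir code: positions are counted in events rather than bytes, and `RenamePositionSketch` and its methods are hypothetical.

```java
final class RenamePositionSketch {
    /** Events emitted when a file holding totalEvents is read from startPos. */
    static int eventsRead(int totalEvents, int startPos) {
        return Math.max(0, totalEvents - startPos);
    }

    /**
     * Replays the scenario from the report.
     * resetPositionOnRename = true models starting the renamed file at 0;
     * false models resuming from the position saved for that file.
     */
    static int totalEvents(boolean resetPositionOnRename) {
        int emitted = 0;

        // 1. logfile1 receives 200 events and is read to the end.
        emitted += eventsRead(200, 0);
        int savedPos = 200;

        // 2. logfile1 is renamed to logfile2; it is the same file with no new data.
        int startPos = resetPositionOnRename ? 0 : savedPos;
        emitted += eventsRead(200, startPos);

        // 3. A brand-new logfile1 receives 100 events and is read from 0.
        emitted += eventsRead(100, 0);

        return emitted;
    }

    public static void main(String[] args) {
        System.out.println("reset to 0 on rename:   " + totalEvents(true));
        System.out.println("resume from saved pos:  " + totalEvents(false));
    }
}
```

Resetting the position on rename re-reads the 200 events already delivered, which accounts for the 200 duplicates.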





[jira] [Commented] (FLUME-2773) TailDirSource throws FileNotFound Exception if ~/.flume directory is not created already

2015-09-24 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907560#comment-14907560
 ] 

Hari Shreedharan commented on FLUME-2773:
-

+1. LGTM

> TailDirSource throws FileNotFound Exception if ~/.flume directory is not 
> created already
> 
>
> Key: FLUME-2773
> URL: https://issues.apache.org/jira/browse/FLUME-2773
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.7.0
>Reporter: Johny Rufus
>Assignee: Johny Rufus
> Attachments: FLUME-2773.patch
>
>
> If we leave the positionFile parameter to default, 
> then the following exception is thrown when ~/.flume is not present
> [We should take care of creating the default directory if not present ]
> 2015-08-27 09:44:30,551 (positionWriter) [ERROR - 
> org.apache.flume.source.taildir.TaildirSource.writePosition(TaildirSource.java:312)]
>  Failed writing positionFile
> java.io.FileNotFoundException: /Users/jrufus/.flume/taildir_position.json (No 
> such file or directory)
> at java.io.FileOutputStream.open(Native Method)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
> at java.io.FileWriter.<init>(FileWriter.java:90)
> at 
> org.apache.flume.source.taildir.TaildirSource.writePosition(TaildirSource.java:306)
> at 
> org.apache.flume.source.taildir.TaildirSource.access$600(TaildirSource.java:56)
> at 
> org.apache.flume.source.taildir.TaildirSource$PositionWriterRunnable.run(TaildirSource.java:298)
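
One way to "take care of creating the default directory" is to create the parent directory before opening the position file. The following is a sketch under that assumption and is not taken from the attached patch. `PositionFileWriterSketch` is a hypothetical class, and the demo writes under a temporary directory instead of the real home directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class PositionFileWriterSketch {
    /** Writes the position file, creating missing parent directories first. */
    static void writePosition(Path positionFile, String json) {
        try {
            Path parent = positionFile.getParent();
            if (parent != null) {
                Files.createDirectories(parent); // does nothing if it already exists
            }
            Files.write(positionFile, json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException("Failed writing positionFile", e);
        }
    }

    /** Writes into a not-yet-existing ".flume" directory under a temp dir. */
    static boolean demo() {
        try {
            Path base = Files.createTempDirectory("flume-sketch");
            Path positionFile = base.resolve(".flume").resolve("taildir_position.json");
            writePosition(positionFile, "[]");
            return Files.exists(positionFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

`Files.createDirectories` is safe to call on every write because it succeeds when the directory is already present.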





[jira] [Commented] (FLUME-2794) Flume 1.6 HBase 1.12 java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.Put.setWriteToWAL(Z)V

2015-09-23 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905673#comment-14905673
 ] 

Hari Shreedharan commented on FLUME-2794:
-

I think we actually need to make code changes to remove the old methods and 
switch to the new ones.

> Flume 1.6 HBase 1.12 java.lang.NoSuchMethodError: 
> org.apache.hadoop.hbase.client.Put.setWriteToWAL(Z)V
> --
>
> Key: FLUME-2794
> URL: https://issues.apache.org/jira/browse/FLUME-2794
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Emmanuel Leroy
>
> with Hbase 1.1.2 and Flume 1.6
> getting this error:
> ase_flume-9k2a4-1441998442358-ec273f9a-0-10152166], Starting
> 2015-09-11 19:07:28,037 (SinkRunner-PollingRunner-DefaultSinkProcessor) 
> [ERROR - org.apache.flume.sink.hbase.HBaseSink.process(HBaseSink.java:351)] 
> Failed to commit transaction.Transaction rolled back.
> java.lang.NoSuchMethodError: 
> org.apache.hadoop.hbase.client.Put.setWriteToWAL(Z)V
>   at org.apache.flume.sink.hbase.HBaseSink$3.run(HBaseSink.java:377)
>   at org.apache.flume.sink.hbase.HBaseSink$3.run(HBaseSink.java:372)
>   at 
> org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
>   at 
> org.apache.flume.sink.hbase.HBaseSink.putEventsAndCommit(HBaseSink.java:372)
>   at org.apache.flume.sink.hbase.HBaseSink.process(HBaseSink.java:342)
>   at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>   at java.lang.Thread.run(Thread.java:745)





[jira] [Commented] (FLUME-2498) Implement Taildir Source

2015-09-23 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905676#comment-14905676
 ] 

Hari Shreedharan commented on FLUME-2498:
-

This is already committed. Can you create a new jira and submit the patch there?

> Implement Taildir Source
> 
>
> Key: FLUME-2498
> URL: https://issues.apache.org/jira/browse/FLUME-2498
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Satoshi Iijima
> Fix For: v1.7.0
>
> Attachments: FLUME-2498-2.patch, FLUME-2498-3.patch, 
> FLUME-2498-4.patch, FLUME-2498-5.patch, FLUME-2498.patch
>
>
> This is the proposal of implementing a new tailing source.
> This source watches the specified files, and tails them in nearly real-time 
> once appends are detected to these files.
> * This source is reliable and will not miss data even when the tailing files 
> rotate.
> * It periodically writes the last read position of each file in a position 
> file using the JSON format.
> * If Flume is stopped or down for some reason, it can restart tailing from 
> the position written on the existing position file.
> * It can add event headers to each tailing file group. 
> The attached patch includes config documentation for this.
> This source requires Unix-style file system and Java 1.7 or later.





Re: JIRA versions

2015-09-19 Thread Hari Shreedharan
Got rid of 1.7 and moved the jiras to v1.7.0.




Thanks, Hari

On Sat, Sep 19, 2015 at 4:19 AM, Otis Gospodnetić
 wrote:

> Hi,
> Just added https://issues.apache.org/jira/browse/FLUME-2797 and set Fix
> Version to 1.7
> Now I see other issues used "v1.7.0" as Fix Version.  Looks like a little
> bit of a mess with different version formats and near-duplicate version
> labels.  Maybe one of the Flume JIRA admins can clean that up?
> Thanks,
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/

Re: New Flafka component - kafka consumer channel

2015-08-27 Thread Hari Shreedharan
So one of the things that the already existing Kafka channel can do is to
run without a source. Does this outperform that as well? I have already
seen people use it this way.


Thanks,
Hari

On Thu, Aug 27, 2015 at 4:11 PM, Roshan Naik ros...@hortonworks.com wrote:

 Wanted to give a heads-up on this idea I have been working on …

 Using Flume as a Kafka producer or consumer has been gaining popularity
 thanks to the Flafka components that were recently introduced.

 For the use case of Flume as a Kafka consumer, it appears we can sidestep
 the compromise between Mem channel (which is fast but can lose data) and
  File channel (which is slow but won't lose data) and get the best of both
 worlds.

 I have a prototype of this idea  for a Kafka Consumer channel.  It is
 designed to enable the use of Flume as a really light weight and very fast
 Kafka consumer without the data loss potential of mem channel.  My
 measurements indicate it easily outperforms memory channel.

 Additional info here  …
 https://github.com/roshannaik/kafka-consumer-channel

 I think the same idea could be applied for Kafka producer channel.

 -roshan



Re: New Flafka component - kafka consumer channel

2015-08-27 Thread Hari Shreedharan
Nope. You can put anything you want, just set parseAsFlumeEvent to false
and the channel won't attempt to convert it into a Flume event. It just
stashes the whole thing into the body of the returned event.


Thanks,
Hari

On Thu, Aug 27, 2015 at 5:53 PM, Roshan Naik ros...@hortonworks.com wrote:

 My understanding is that the Kafka channel expects Flume Event objects
 to be stored in the Kafka topic.
 Isn't that right ?
 -roshan


 On 8/27/15 5:47 PM, Hari Shreedharan hshreedha...@cloudera.com wrote:

 So one of the things that the already existing Kafka channel can do is to
 run without a source. Does this outperform that as well? I have already
 seen people use it this way.
 
 
 Thanks,
 Hari
 
 On Thu, Aug 27, 2015 at 4:11 PM, Roshan Naik ros...@hortonworks.com
 wrote:
 
  Wanted to give a heads-up on this idea I have been working on …
 
  Using Flume as a Kafka producer or consumer has been gaining popularity
  thanks to the Flafka components that were recently introduced.
 
  For the use case of Flume as a Kafka consumer, it appears we can sidestep
  the compromise between Mem channel (which is fast but can lose data) and
  File channel (which is slow but won't lose data) and get the best of both
  worlds.
 
  I have a prototype of this idea for a Kafka Consumer channel.  It is
  designed to enable the use of Flume as a really light weight and very fast
  Kafka consumer without the data loss potential of mem channel.  My
  measurements indicate it easily outperforms memory channel.
 
  Additional info here …
  https://github.com/roshannaik/kafka-consumer-channel
 
  I think the same idea could be applied for Kafka producer channel.
 
  -roshan
 




[jira] [Commented] (FLUME-2765) ThriftSource spaws too many threads

2015-08-17 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700505#comment-14700505
 ] 

Hari Shreedharan commented on FLUME-2765:
-

I have a patch for this one ready. I will submit it soon

 ThriftSource spaws too many threads
 ---

 Key: FLUME-2765
 URL: https://issues.apache.org/jira/browse/FLUME-2765
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: 1.6
Reporter: Tobias Heintz
 Attachments: thread-dump-flume-1.6.txt


 We are in the process of migrating from the old Flume to version 1.6. We are 
 using the ThriftSource with the new KafkaSink. Here's what our config looks 
 like:
 {code}
 agent1.channels = ch1
 agent1.sources = thriftSrc
 agent1.sinks = kafka
 agent1.channels.ch1.type = memory
 agent1.channels.ch1.capacity = 1
 agent1.channels.ch1.transactionCapacity = 500
 # THRIFT
 agent1.sources.thriftSrc.type = thrift
 agent1.sources.thriftSrc.channels = ch1
 agent1.sources.thriftSrc.bind = 0.0.0.0
 agent1.sources.thriftSrc.port = 4042
 # if we don't set this option, the source keeps creating more and more
 # threads until all heap memory is used up and then it crashes
 agent1.sources.thriftSrc.threads = 150
 # KAFKA
 agent1.sinks.kafka.channel = ch1
 agent1.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
 agent1.sinks.kafka.batchSize = 50
 agent1.sinks.kafka.brokerList = broker.example.com:9092
 agent1.sinks.kafka.requiredAcks = 1
 agent1.sinks.kafka.topic = topic1
 {code}
 We have been noticing some bad behavior by the Thrift source/Thrift server 
 using the JMX connection. If we don't restrict the number of threads, it 
 spawns thousands of new threads, apparently one for every message it 
 receives. These threads all have the name Flume Thrift IPC Thread [number] 
 and according to the jvisualvm console they are always idle. At some point 
 all of the JVM memory is used up through creating new threads and flume 
 crashes with the following exception:
 {code}
 12 Aug 2015 16:56:11,721 ERROR [Thread-1] 
 (org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run:544)  - 
 run() exiting due to uncaught error
 java.lang.OutOfMemoryError: unable to create new native thread
 at java.lang.Thread.start0(Native Method)
 at java.lang.Thread.start(Thread.java:714)
 at 
 java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
 at 
 java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
 at 
 org.apache.thrift.server.TThreadedSelectorServer.requestInvoke(TThreadedSelectorServer.java:310)
 at 
 org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:209)
 at 
 org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.select(TThreadedSelectorServer.java:576)
 at 
 org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run(TThreadedSelectorServer.java:536)
 {code}
 When we set the option to restrict the number of threads, the server sticks 
 to that number and runs smoothly, however it drops messages occasionally (may 
 have a different cause).
 I am wondering whether this is a bug or in some way expected behavior? What 
 are the best practices for using a ThriftSource? Are there further parameters 
 to possibly tune (like channel.capacity)?
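
The difference between the unrestricted and restricted cases resembles the difference between an unbounded cached thread pool and a fixed-size pool in `java.util.concurrent`. The sketch below illustrates that general JDK behavior only. Whether ThriftSource uses exactly these pool types is an assumption here, and `ThreadPoolGrowthSketch` is a hypothetical name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ThreadPoolGrowthSketch {
    /**
     * Submits `tasks` jobs that all block until released, then reports the
     * largest number of threads the pool ever held. The pool is shut down
     * before returning.
     */
    static int largestPoolSize(ExecutorService pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // simulate a handler that is busy or stuck
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Both factory methods used below return a ThreadPoolExecutor.
        int largest = ((ThreadPoolExecutor) pool).getLargestPoolSize();
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return largest;
    }

    public static void main(String[] args) {
        System.out.println("cached pool threads: "
                + largestPoolSize(Executors.newCachedThreadPool(), 50));
        System.out.println("fixed pool threads:  "
                + largestPoolSize(Executors.newFixedThreadPool(4), 50));
    }
}
```

A cached pool starts a new thread whenever no idle thread is available, so a burst of slow requests translates directly into a burst of threads. A fixed pool queues the excess work instead, trading thread growth for queueing latency.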





[jira] [Commented] (FLUME-2485) Thrift Source tests fail on Oracle JDK 8

2015-08-04 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654063#comment-14654063
 ] 

Hari Shreedharan commented on FLUME-2485:
-

+1. LGTM

 Thrift Source tests fail on Oracle JDK 8
 

 Key: FLUME-2485
 URL: https://issues.apache.org/jira/browse/FLUME-2485
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.6.0
Reporter: Santiago M. Mola
Assignee: Miroslav Holubec
  Labels: jdk8
 Attachments: 
 0001-FLUME-2485-too-fast-processing-leads-to-failed-jUnit.patch, 
 FLUME-2485.patch


 Thrift Source tests fail on Oracle JDK 8:
 https://travis-ci.org/Stratio/flume/jobs/36817396#L6245
 testAppendBatch(org.apache.flume.source.TestThriftSource)  Time elapsed: 6083 
 sec   FAILURE!
 java.lang.AssertionError
   at org.junit.Assert.fail(Assert.java:92)
   at org.junit.Assert.assertTrue(Assert.java:43)
   at org.junit.Assert.assertTrue(Assert.java:54)
   at 
 org.apache.flume.source.TestThriftSource.testAppendBatch(TestThriftSource.java:144)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:483)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
   at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:483)
   at 
 org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
   at 
 org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
   at 
 org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)





[jira] [Commented] (FLUME-2485) Thrift Source tests fail on Oracle JDK 8

2015-08-04 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654175#comment-14654175
 ] 

Hari Shreedharan commented on FLUME-2485:
-

[~jrufus] Feel free to commit

 Thrift Source tests fail on Oracle JDK 8
 

 Key: FLUME-2485
 URL: https://issues.apache.org/jira/browse/FLUME-2485
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.6.0
Reporter: Santiago M. Mola
Assignee: Miroslav Holubec
  Labels: jdk8
 Attachments: 
 0001-FLUME-2485-too-fast-processing-leads-to-failed-jUnit.patch, 
 FLUME-2485.patch


 Thrift Source tests fail on Oracle JDK 8:
 https://travis-ci.org/Stratio/flume/jobs/36817396#L6245
 testAppendBatch(org.apache.flume.source.TestThriftSource)  Time elapsed: 6083 
 sec   FAILURE!
 java.lang.AssertionError
   at org.junit.Assert.fail(Assert.java:92)
   at org.junit.Assert.assertTrue(Assert.java:43)
   at org.junit.Assert.assertTrue(Assert.java:54)
   at 
 org.apache.flume.source.TestThriftSource.testAppendBatch(TestThriftSource.java:144)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:483)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
   at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:483)
   at 
 org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
   at 
 org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
   at 
 org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)





Re: Hive test failures

2015-08-03 Thread Hari Shreedharan
Same for me. I don't see the failure locally. I don't have access to the
build machines, so is there a way we can create a temp dir and pass it into
the tests?
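
A rough sketch of what "create a temp dir and pass it into the tests" could look like. It assumes the scratch location is controlled by the Hive setting `hive.exec.scratchdir` and that the test setup picks it up from a system property. Both are assumptions: the test may instead need to set the value on its own Hive configuration object. `ScratchDirSketch` is a hypothetical helper.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ScratchDirSketch {
    // Assumed name of the Hive scratch directory setting.
    static final String SCRATCH_DIR_PROPERTY = "hive.exec.scratchdir";

    /** Creates a fresh, writable temp directory and publishes it as the scratch dir. */
    static Path useFreshScratchDir() {
        try {
            Path dir = Files.createTempDirectory("flume-hive-test");
            dir.toFile().deleteOnExit(); // best effort; only removes it if empty
            System.setProperty(SCRATCH_DIR_PROPERTY, dir.toString());
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException("could not create scratch directory", e);
        }
    }

    public static void main(String[] args) {
        Path dir = useFreshScratchDir();
        System.out.println(SCRATCH_DIR_PROPERTY + " = " + dir);
    }
}
```

Calling something like this from the test's setup method would avoid depending on whatever permissions a pre-existing /tmp/hive has on the build machine.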

On Fri, Jul 31, 2015 at 4:18 PM, Roshan Naik ros...@hortonworks.com wrote:

 Hari,
   Speaking to some Hive devs, it looks like that /tmp/hive folder may have
 been created through some other mechanism with different permissions.
   Could you see if that folder can be deleted manually on the machines where
 this is running ?

   This failure does not happen when I run the test locally on my laptop.
 -roshan


 On 7/30/15 3:26 PM, Roshan Naik ros...@hortonworks.com wrote:

 Shall take a look at it .. maybe tomorrow.
 -roshan
 
 
 On 7/29/15 8:00 PM, Hari Shreedharan hshreedha...@cloudera.com wrote:
 
 Hi,
 
 It looks like Hive tests have been failing for a while now:
 https://builds.apache.org/job/Flume-trunk-hbase-1/116/#showFailuresLink
 
 Any idea what is happening here - it seems like it is something related to
 the scratch directory not being writable.
 
 Thanks,
 Hari
 
 




[jira] [Commented] (FLUME-2660) Add documentation for EventValidator

2015-07-29 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646997#comment-14646997
 ] 

Hari Shreedharan commented on FLUME-2660:
-

[~ashishpaliwal]/[~jrufus] - Could one of you apply the latest patch to the 
site? I believe Ashish applied the previous one (since this is really available 
in 1.6)

 Add documentation for EventValidator
 

 Key: FLUME-2660
 URL: https://issues.apache.org/jira/browse/FLUME-2660
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.5.1
Reporter: Hari Shreedharan
Assignee: Ashish Paliwal
 Fix For: v1.7.0

 Attachments: FLUME-2660-0.patch, FLUME-2660-1.patch


 [~paliwalashish] - Assigning this to you. Please add docs for the 
 functionality you contributed in FLUME-2613





[jira] [Commented] (FLUME-2752) Flume AvroSoucr will leak the memory and the OOM will be happened.

2015-07-29 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647092#comment-14647092
 ] 

Hari Shreedharan commented on FLUME-2752:
-

Good catch! We should make sure we do shut down the thread pool if the start 
fails.
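
A minimal sketch of the cleanup suggested above, assuming a source that creates its thread pool before attempting to bind. `PoolOwningSource` and its `bind` argument are hypothetical stand-ins for illustration and are not Flume's actual AvroSource code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a source that owns a thread pool. */
final class PoolOwningSource {
    private ExecutorService workers;

    /** Starts the source; 'bind' stands for the step that may fail at runtime. */
    void start(Runnable bind) {
        workers = Executors.newFixedThreadPool(2);
        try {
            bind.run();
        } catch (RuntimeException e) {
            // Release the pool before propagating the failure, so that a
            // supervisor retrying start() does not leave one abandoned
            // pool behind per failed attempt.
            workers.shutdownNow();
            workers = null;
            throw e;
        }
    }

    /** True only while a usable pool is held. */
    boolean hasLivePool() {
        return workers != null && !workers.isShutdown();
    }

    void stop() {
        if (workers != null) {
            workers.shutdownNow();
            workers = null;
        }
    }
}
```

The design choice here is simply that whatever start() allocates is released on the failure path as well as in stop().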

 Flume AvroSoucr will leak the memory and the OOM will be happened.
 --

 Key: FLUME-2752
 URL: https://issues.apache.org/jira/browse/FLUME-2752
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.6.0
Reporter: yinghua_zh

 If the Flume agent is configured with a nonexistent IP for the Avro source, the 
 following exception occurs:
 2015-07-21 19:57:47,054 | ERROR | [lifecycleSupervisor-1-2] |  Unable to 
 start EventDrivenSourceRunner: { source:Avro source avro_source_21155: { 
 bindAddress: 51.196.27.32, port: 21155 } } - Exception follows.  | 
 org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:253)
 org.jboss.netty.channel.ChannelException: Failed to bind to: 
 /51.196.27.32:21155
   at 
 org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:297)
   at org.apache.avro.ipc.NettyServer.init(NettyServer.java:106)
   at org.apache.flume.source.AvroSource.start(AvroSource.java:294)
   at 
 org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
   at 
 org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
   at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
   at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.net.BindException: Cannot assign requested address
   at sun.nio.ch.Net.bind0(Native Method)
   at sun.nio.ch.Net.bind(Net.java:437)
   at sun.nio.ch.Net.bind(Net.java:429)
   at 
 sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
   at 
 org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.bind(NioServerSocketPipelineSink.java:140)
   at 
 org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleServerSocket(NioServerSocketPipelineSink.java:90)
   at 
 org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:64)
   at org.jboss.netty.channel.Channels.bind(Channels.java:569)
   at 
 org.jboss.netty.channel.AbstractChannel.bind(AbstractChannel.java:189)
   at 
 org.jboss.netty.bootstrap.ServerBootstrap$Binder.channelOpen(ServerBootstrap.java:342)
   at org.jboss.netty.channel.Channels.fireChannelOpen(Channels.java:170)
   at 
 org.jboss.netty.channel.socket.nio.NioServerSocketChannel.init(NioServerSocketChannel.java:80)
   at 
 org.jboss.netty.channel.socket.nio.NioServerSocketChannelFactory.newChannel(NioServerSocketChannelFactory.java:158)
   at 
 org.jboss.netty.channel.socket.nio.NioServerSocketChannelFactory.newChannel(NioServerSocketChannelFactory.java:86)
   at 
 org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:276)
 If the above exception keeps occurring for 2 hours and the agent JVM -Xxx is 4G, 
 an OutOfMemory error occurs.





Hive test failures

2015-07-29 Thread Hari Shreedharan
Hi,

It looks like Hive tests have been failing for a while now:
https://builds.apache.org/job/Flume-trunk-hbase-1/116/#showFailuresLink

Any idea what is happening here - it seems like it is something related to
the scratch directory not being writable.

Thanks,
Hari


[jira] [Commented] (FLUME-2749) Kerberos configuration error when using short names in multiple HDFS Sinks

2015-07-27 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643839#comment-14643839
 ] 

Hari Shreedharan commented on FLUME-2749:
-

Committed! Thanks Johny!

 Kerberos configuration error when using short names in multiple HDFS Sinks
 --

 Key: FLUME-2749
 URL: https://issues.apache.org/jira/browse/FLUME-2749
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.6.0
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.7.0

 Attachments: FLUME-2749.patch


 When we have more than one HDFS Sink, configured in kerberos mode, and 
 principal is configured with a short name like 'flume' (without the @REALM 
 information), we get a 
 java.lang.IllegalStateException: Cannot use multiple kerberos principals in 
 the same agent.  Must restart agent to use new principal or keytab. Previous 
 = fl...@example.com (auth:KERBEROS), New = flume
   at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:172)
   at 
 org.apache.flume.auth.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:131)
   at 
 org.apache.flume.auth.FlumeAuthenticationUtil.getAuthenticator(FlumeAuthenticationUtil.java:67)
   at 
 org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:261)
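
One way to see why the check above trips: a fully qualified principal and its short name are compared as plain strings. The sketch below qualifies a short name with a default realm before comparing. It is an illustration only, not the committed patch; `PrincipalNames`, `qualify`, and the `defaultRealm` parameter are hypothetical names:

```java
/** Hypothetical helper for comparing Kerberos principal names. */
final class PrincipalNames {
    private PrincipalNames() { }

    /** Appends the default realm when the principal has no @REALM part. */
    static String qualify(String principal, String defaultRealm) {
        return principal.contains("@") ? principal : principal + "@" + defaultRealm;
    }

    /** Compares two principals after qualifying both with the same default realm. */
    static boolean samePrincipal(String a, String b, String defaultRealm) {
        return qualify(a, defaultRealm).equals(qualify(b, defaultRealm));
    }
}
```

With this, a short name such as 'flume' and the same name carrying the default realm are treated as one principal, while genuinely different principals still compare as unequal.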





[jira] [Commented] (FLUME-2749) Kerberos configuration error when using short names in multiple HDFS Sinks

2015-07-27 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643812#comment-14643812
 ] 

Hari Shreedharan commented on FLUME-2749:
-

Looks good. Running tests now.

 Kerberos configuration error when using short names in multiple HDFS Sinks
 --

 Key: FLUME-2749
 URL: https://issues.apache.org/jira/browse/FLUME-2749
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.6.0
Reporter: Johny Rufus
Assignee: Johny Rufus
 Attachments: FLUME-2749.patch


 When we have more than one HDFS Sink, configured in kerberos mode, and 
 principal is configured with a short name like 'flume' (without the @REALM 
 information), we get a 
 java.lang.IllegalStateException: Cannot use multiple kerberos principals in 
 the same agent.  Must restart agent to use new principal or keytab. Previous 
 = fl...@example.com (auth:KERBEROS), New = flume
   at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:172)
   at 
 org.apache.flume.auth.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:131)
   at 
 org.apache.flume.auth.FlumeAuthenticationUtil.getAuthenticator(FlumeAuthenticationUtil.java:67)
   at 
 org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:261)





[jira] [Commented] (FLUME-2498) Implement Taildir Source

2015-07-22 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637926#comment-14637926
 ] 

Hari Shreedharan commented on FLUME-2498:
-

[~jrufus]/[~roshan_naik] - Do you think one of you would be able to take a look 
at this one?

 Implement Taildir Source
 

 Key: FLUME-2498
 URL: https://issues.apache.org/jira/browse/FLUME-2498
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Satoshi Iijima
 Fix For: v1.7.0

 Attachments: FLUME-2498-2.patch, FLUME-2498.patch


 This is the proposal of implementing a new tailing source.
 This source watches the specified files, and tails them in nearly real-time 
 once appends are detected to these files.
 * This source is reliable and will not miss data even when the tailing files 
 rotate.
 * It periodically writes the last read position of each file in a position 
 file using the JSON format.
 * If Flume is stopped or down for some reason, it can restart tailing from 
 the position written on the existing position file.
 * It can add event headers to each tailing file group. 
 The attached patch includes config documentation for this.
 This source requires a Unix-style file system and Java 1.7 or later.
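
As an illustration of the position-file idea described above, the sketch below renders the last read position of each tailed file as JSON. The field names "file" and "pos" and the class `PositionFile` are assumptions made for this example; the format used by the attached patch may differ:

```java
import java.util.Map;

/** Hypothetical renderer for a tailing source's position file. */
final class PositionFile {
    private PositionFile() { }

    /** Renders each file path and its last read byte offset as a JSON array. */
    static String toJson(Map<String, Long> positions) {
        StringBuilder sb = new StringBuilder("[");
        boolean first = true;
        for (Map.Entry<String, Long> e : positions.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append("{\"file\":\"").append(escape(e.getKey()))
              .append("\",\"pos\":").append(e.getValue()).append('}');
        }
        return sb.append(']').toString();
    }

    /** Escapes backslashes and double quotes only; control characters are not handled. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

On restart, a source following this scheme would read the file back and seek each tailed file to its recorded offset before resuming.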





[jira] [Updated] (FLUME-2738) Async HBase sink FD leak on client shutdown

2015-07-09 Thread Hari Shreedharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Shreedharan updated FLUME-2738:

Affects Version/s: v1.6.0
  Component/s: Sinks+Sources

 Async HBase sink FD leak on client shutdown
 ---

 Key: FLUME-2738
 URL: https://issues.apache.org/jira/browse/FLUME-2738
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.6.0
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.7.0

 Attachments: FLUME-2738-1.patch, FLUME-2738-2.patch, 
 FLUME-2738-3.patch, FLUME-2738.patch


 Currently every time Async Hbase Sink calls HBaseSink.shutdown, there is an FD 
 leak due to HBaseSink using CustomChannelFactory where 
 releaseExternalResources() is overridden to a No-op. Need to replace this 
 with a standard NioClientSocketChannelFactory, that releases external 
 resources properly
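
The mechanism described above can be sketched without Netty: if a subclass overrides the release method with a no-op, every create/shutdown cycle leaves one resource open. The classes below are hypothetical and only model the counting; they make no claim about what Netty's factories actually release:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of a factory whose release step can be overridden. */
final class ReleaseDemo {
    private ReleaseDemo() { }

    /** Number of simulated resources currently held open. */
    static final AtomicInteger OPEN = new AtomicInteger();

    static class ResourceFactory {
        ResourceFactory() {
            OPEN.incrementAndGet();
        }

        void releaseExternalResources() {
            OPEN.decrementAndGet();
        }
    }

    static class NoOpReleaseFactory extends ResourceFactory {
        @Override
        void releaseExternalResources() {
            // Intentionally empty: the resource acquired in the constructor is never released.
        }
    }

    /** Runs create/release cycles and returns how many resources remain open. */
    static int openAfterCycles(boolean noOpRelease, int cycles) {
        OPEN.set(0);
        for (int i = 0; i < cycles; i++) {
            ResourceFactory f = noOpRelease ? new NoOpReleaseFactory() : new ResourceFactory();
            f.releaseExternalResources();
        }
        return OPEN.get();
    }
}
```

In this model the leak grows linearly with the number of client shutdowns, which is the pattern the issue describes.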





[jira] [Commented] (FLUME-2738) Async HBase sink FD leak on client shutdown

2015-07-09 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621108#comment-14621108
 ] 

Hari Shreedharan commented on FLUME-2738:
-

+1. Looks good to me.

 Async HBase sink FD leak on client shutdown
 ---

 Key: FLUME-2738
 URL: https://issues.apache.org/jira/browse/FLUME-2738
 Project: Flume
  Issue Type: Bug
Reporter: Johny Rufus
Assignee: Johny Rufus
 Attachments: FLUME-2738-1.patch, FLUME-2738-2.patch, 
 FLUME-2738-3.patch, FLUME-2738.patch


 Currently every time Async Hbase Sink calls HBaseSink.shutdown, there is an FD 
 leak due to HBaseSink using CustomChannelFactory where 
 releaseExternalResources() is overridden to a No-op. Need to replace this 
 with a standard NioClientSocketChannelFactory, that releases external 
 resources properly





[jira] [Commented] (FLUME-2738) Async HBase sink FD leak on client shutdown

2015-07-08 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619275#comment-14619275
 ] 

Hari Shreedharan commented on FLUME-2738:
-

I don't see how this fixes the issue. Tracing the {{releaseExternalResources}} 
calls only shows a bunch of buffers being cleared up. It is nice to clear up 
memory, but where are the sockets or FDs being closed/nulled out? I don't see 
that anywhere in the {{releaseExternalResources}} call chain.

 Async HBase sink FD leak on client shutdown
 ---

 Key: FLUME-2738
 URL: https://issues.apache.org/jira/browse/FLUME-2738
 Project: Flume
  Issue Type: Bug
Reporter: Johny Rufus
Assignee: Johny Rufus
 Attachments: FLUME-2738.patch


 Currently every time Async Hbase Sink calls HBaseSink.shutdown, there is an FD 
 leak due to HBaseSink using CustomChannelFactory where 
 releaseExternalResources() is overridden to a No-op. Need to replace this 
 with a standard NioClientSocketChannelFactory, that releases external 
 resources properly





[jira] [Commented] (FLUME-2713) Document Fault Tolerant Config parameters in FlumeUserGuide

2015-07-07 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617769#comment-14617769
 ] 

Hari Shreedharan commented on FLUME-2713:
-

[~ashishpaliwal] - Could you please explicitly +1 the jira before committing? 
Thanks!

 Document Fault Tolerant Config parameters in FlumeUserGuide
 ---

 Key: FLUME-2713
 URL: https://issues.apache.org/jira/browse/FLUME-2713
 Project: Flume
  Issue Type: Documentation
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.7.0

 Attachments: FLUME-2713.patch


 The following FaultTolerance related parameters in MorphlineSolrSink need to 
 be documented in Flume user guide
 FaultTolerance.IS_PRODUCTION_MODE
 FaultTolerance.IS_IGNORING_RECOVERABLE_EXCEPTIONS
 FaultTolerance.RECOVERABLE_EXCEPTION_CLASSES



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2732) Make maximum tolerated failures before shutting down and recreating client in AsyncHbaseSink configurable

2015-07-07 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617950#comment-14617950
 ] 

Hari Shreedharan commented on FLUME-2732:
-

+1. Looks good

 Make maximum tolerated failures before shutting down and recreating client in 
 AsyncHbaseSink configurable
 -

 Key: FLUME-2732
 URL: https://issues.apache.org/jira/browse/FLUME-2732
 Project: Flume
  Issue Type: Bug
Reporter: Johny Rufus
Assignee: Johny Rufus
 Attachments: FLUME-2732.patch


 In AsyncHbaseSink, the maximum consecutive transaction failures, after which 
 we shut down the HbaseClient and recreate the client, is currently hardcoded 
 to 10. (This change was introduced to overcome a Memory leak in 
 AsyncHbaseClient)
 This needs to be configurable, and defaulted to 0 (Unlimited)
 The reason for this change is to overcome a bug in the AsyncHbaseClient, that 
 starts leaking File Descriptors on shutdown. 
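
A minimal sketch of the proposed behaviour, assuming the limit arrives as an integer where 0 means unlimited. `FailurePolicy` and its method names are hypothetical; the sketch does not show the sink's actual configuration key or code:

```java
/** Hypothetical tracker for consecutive transaction failures. */
final class FailurePolicy {
    private final int maxConsecutiveFails; // 0 means unlimited: never recreate the client
    private int consecutiveFails;

    FailurePolicy(int maxConsecutiveFails) {
        if (maxConsecutiveFails < 0) {
            throw new IllegalArgumentException("maxConsecutiveFails must be >= 0");
        }
        this.maxConsecutiveFails = maxConsecutiveFails;
    }

    /** Records a failed transaction; returns true when the client should be recreated. */
    boolean recordFailure() {
        consecutiveFails++;
        if (maxConsecutiveFails > 0 && consecutiveFails >= maxConsecutiveFails) {
            consecutiveFails = 0; // start counting afresh for the new client
            return true;
        }
        return false;
    }

    /** A successful transaction breaks the run of consecutive failures. */
    void recordSuccess() {
        consecutiveFails = 0;
    }
}
```

With a limit of 0 the policy never asks for a client restart, which avoids the shutdown path that leaks file descriptors.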



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2672) NPE in KafkaSourceCounter

2015-07-06 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615958#comment-14615958
 ] 

Hari Shreedharan commented on FLUME-2672:
-

[~ashishpaliwal] - If this looks ok to you, do you mind committing it? Thanks!

 NPE in KafkaSourceCounter
 -

 Key: FLUME-2672
 URL: https://issues.apache.org/jira/browse/FLUME-2672
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.6.0
 Environment: Mac OS 10.10.3, Java 1.7.0_60
Reporter: Rigo MacTaggart
Priority: Trivial
  Labels: easyfix
 Attachments: FLUME-2672-with-test.patch, FLUME-2672.patch

   Original Estimate: 0h
  Remaining Estimate: 0h

 An NPE is thrown when KafkaSource calls counter.incrementKafkaEmptyCount() 
 because it expects MonitoredCounterGroup.counterMap to contain key 
 source.kafka.empty.count. A patch is included which adds this key to the 
 ATTRIBUTES string array, which is used to pre-populate keys with an initial 
 value.
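
The failure mode described above can be reproduced with a simplified counter group, sketched here under the assumption that counters live in a map pre-populated from a list of attribute names. `CounterGroupSketch` is a hypothetical stand-in and is not Flume's MonitoredCounterGroup:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical counter group whose keys must be registered up front. */
final class CounterGroupSketch {
    private final Map<String, AtomicLong> counterMap = new HashMap<>();

    /** Pre-populates one zero-valued counter per attribute name. */
    CounterGroupSketch(String... attributes) {
        for (String attribute : attributes) {
            counterMap.put(attribute, new AtomicLong(0L));
        }
    }

    /** Throws NullPointerException when 'key' was never registered. */
    long increment(String key) {
        return counterMap.get(key).incrementAndGet();
    }
}
```

Registering source.kafka.empty.count in the attribute list makes the increment succeed; omitting it reproduces the NullPointerException.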





[ANNOUNCE] New Flume committer - Johny Rufus

2015-06-19 Thread Hari Shreedharan
On behalf of the Apache Flume PMC, I am excited to welcome Johny Rufus as a
committer on the Apache Flume project. Johny has actively contributed
several patches to the Flume project, including bug fixes, authentication
and other new features.

Congratulations and Welcome, Johny!


Cheers,
Hari Shreedharan


[jira] [Comment Edited] (FLUME-2721) Add support for custom serializer in Kafka Sink

2015-06-18 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591250#comment-14591250
 ] 

Hari Shreedharan edited comment on FLUME-2721 at 6/18/15 5:53 PM:
--

Kafka has its own serializers though I am not sure how that exactly works, 
since the type of our {{KeyedMessage}} instances are hard-coded, and they, I  
believe, have to match what the serializer expects? I'd rather have those 
serializers take care of it than add an additional layer.


was (Author: hshreedharan):
Kafka has its own serializers though I am not sure how that exactly works, 
since the type of our {{KeyedMessage}}s are hard-coded, and they, I  believe, 
have to match what the serializer expects? I'd rather have those serializers 
take care of it than add an additional layer.

 Add support for custom serializer in Kafka Sink
 ---

 Key: FLUME-2721
 URL: https://issues.apache.org/jira/browse/FLUME-2721
 Project: Flume
  Issue Type: Improvement
  Components: Sinks+Sources
Affects Versions: v1.6.0
Reporter: Benjamin Fiorini
 Attachments: FLUME-2721-0.patch


 In 1.6.0, the Kafka sink just sends the content of the body. It would be 
 really cool to be able to have a custom serializer.
 My use case is that I'd like to send some headers as well, or better send the 
 event serialized in JSON: {code}{headers:{...},body:...}{code}. There 
 could be: body serializer (current behaviour) and JSON serializer.
 Will try to submit a patch soon if this makes sense.
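
For illustration, the JSON form requested above could be produced roughly as follows. This is a sketch under stated assumptions: the body is treated as UTF-8 text, only backslashes and double quotes are escaped, and `JsonEventSketch` is a hypothetical class rather than the attached patch or any Kafka or Flume API:

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical serializer that emits an event's headers and body as JSON. */
final class JsonEventSketch {
    private JsonEventSketch() { }

    /** Wraps a string in double quotes, escaping backslashes and quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders the event as a JSON object with a "headers" object and a "body" string. */
    static String toJson(Map<String, String> headers, byte[] body) {
        StringBuilder sb = new StringBuilder("{\"headers\":{");
        boolean first = true;
        for (Map.Entry<String, String> h : headers.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append(quote(h.getKey())).append(':').append(quote(h.getValue()));
        }
        // Assumption: the body is UTF-8 text; binary bodies would need e.g. Base64.
        sb.append("},\"body\":")
          .append(quote(new String(body, StandardCharsets.UTF_8)))
          .append('}');
        return sb.toString();
    }
}
```

A body-only serializer, matching the current behaviour, would simply return the body bytes unchanged.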



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

