Re: [VOTE] Release Apache Flume Spring Boot 2.0.0-rc2

2023-04-02 Thread Mike Percy
+1 on the release

I didn’t test it but it looks like Ralph and Donát did, so here is my +1 to 
unblock the release.

Mike

Sent from my iPhone

> On Mar 30, 2023, at 9:45 PM, Ralph Goers  wrote:
> 
> Note that although this has 3 +1 votes it still needs one more +1 vote from 
> a PMC member. This vote has now been open for 10 days.
> 
> Ralph
> 
>> On Mar 30, 2023, at 8:12 AM, Ralph Goers  wrote:
>> 
>> This is my formal +1 on the release.
>> 
>> Ralph
>> 
>>> On Mar 20, 2023, at 12:48 PM, Ralph Goers wrote:
>>> 
>>> This is a vote to release Flume Spring Boot 2.0.0. Flume Spring Boot has 
>>> moved from the full Flume release to its own repo. Note that the staging 
>>> web site has been updated to reflect the fix for Bug #1 below. Also note 
>>> that this repo supports the use of GitHub Issues for bug tracking. 
>>> 
>>> RC2 adds a missing sentence to the NOTICE file. The web site has been 
>>> updated to include a section for sub projects and Flume Spring Boot has 
>>> been added there. 
>>> 
>>> Please download, test, and cast your votes on the Flume developers list.
>>> [] +1, release the artifacts
>>> [] -1, don't release because...
>>> 
>>> The vote will remain open for 72 hours. All votes are welcome and we 
>>> encourage everyone to test the release, but only Flume PMC votes are 
>>> “officially” counted. As always, at least 3 +1 votes and more positive than 
>>> negative votes are required.
>>> 
>>> Changes in this release include:
>>> 
>>> ** Bug
>>> • [#1] - Require Applications to define a configuration class containing 
>>> the appropriate ComponentScan declaration in spring.factories.
>>> 
>>> 
>>> Tag: 
>>> a) for a new copy do "git clone 
>>> https://github.com/apache/flume-spring-boot.git" and then "git checkout 
>>> tags/release-2.0.0-rc2", or just "git clone -b release-2.0.0-rc2 
>>> https://github.com/apache/flume-spring-boot.git";
>>> b) for an existing working copy do "git pull" and then "git checkout 
>>> tags/release-2.0.0-rc2"
>>> 
>>> Web Site:  https://flume.staged.apache.org/.  Specifically for Flume Spring 
>>> Boot - https://flume.staged.apache.org/flume-spring-boot/index.html.
>>> 
>>> Maven Artifacts: 
>>> https://repository.apache.org/content/repositories/orgapacheflume-1045.
>>> 
>>> Distribution archives: 
>>> https://dist.apache.org/repos/dist/dev/flume/flume-spring-boot 
>>> 
>>> You may download all the Maven artifacts by executing:
>>> wget -e robots=off --cut-dirs=7 -nH -r -p -np --no-check-certificate 
>>> https://repository.apache.org/content/repositories/orgapacheflume-1045/org/apache/flume/
>>> 
>>> Ralph
>> 
> 
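
The counting rule quoted above (only PMC votes are binding; at least three +1 votes and more +1 than -1 votes) can be sketched in shell. This is only an illustration: the tally_votes name and the one-vote-per-line input format ("name binding|non-binding +1|-1", with a single-word name) are assumptions made up for this sketch, not an ASF tool.

```shell
#!/bin/sh
# tally_votes: read "name binding|non-binding +1|-1" lines on stdin and
# print PASS or FAIL based on the binding votes only.
# The input format is hypothetical; no ASF tooling produces it.
tally_votes() {
  awk '
    $2 == "binding" && $3 == "+1" { plus++ }
    $2 == "binding" && $3 == "-1" { minus++ }
    END {
      # Rule from the vote email: at least 3 binding +1 votes and
      # more binding +1 than binding -1 votes.
      if (plus >= 3 && plus > minus) print "PASS"; else print "FAIL"
    }
  '
}
```

For example, piping three "binding +1" lines into tally_votes prints PASS, while two binding +1 votes plus any number of non-binding ones prints FAIL, which matches the situation described in the Mar 30 reminder.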


Re: [ANNOUNCE] New Flume committer - Tristan Stevens

2020-01-20 Thread Mike Percy
Congrats Tristan! Welcome aboard.

Best,
Mike

On Mon, Jan 13, 2020 at 9:29 PM Bessenyei Balázs Donát 
wrote:

> On behalf of the Apache Flume PMC, I am very pleased to welcome Tristan
> Stevens
> as a committer on the Apache Flume project.
>
> Tristan has been contributing to Flume over an extended period of time.
> He has contributed many patches, almost all of which are fixes for bugs
> or important improvements.
>
> Congratulations and welcome, Tristan!
>
>
> Best,
> Donat
>


Re: Flume logo

2019-05-20 Thread Mike Percy
Hi Daniel,
I dug around a few years ago and unfortunately I could not find a vector
version of the Flume logo.

Out of the ones above, I like the current official logo the best. I admit
there is room for improvement but a contract with a professional graphic
designer (99 designs?) would likely be required to exceed the current bar
IMHO.

Best regards,
Mike

On Sun, May 19, 2019 at 12:09 PM Daniel Gruno  wrote:

> https://i.imgur.com/7MOMu0D.png has a few more options to pick from, as
> well as a comparison between original and suggested ones :)
>
> On 2019/05/19 17:02:18, Daniel Gruno  wrote:
> > Hi folks,
> > As part of maintaining the foundation-wide list of project logos, I was
> wondering if you had a vector version of the Flume logo, so we could get
> some high res versions of it?
> >
> > If not, I'd be happy to make an alternate/new logo for y'all, in line
> with the old one. I doodled a bit today and came up with
> https://i.imgur.com/seJ9FhV.png :) (available as svg also)
> >
> > With regards,
> > Daniel on behalf of Apache Central Services.
> >
>


Re: Better Marketing

2019-04-28 Thread Mike Percy
Great, sounds like you made progress on the perf thing. I’m not talking about 
other products Flume is bundled with, simply what the project ships with the 
binary artifacts at release time.

Mike

Sent from my iPhone

> On Apr 28, 2019, at 12:04 PM, Ralph Goers  wrote:
> 
> Yes, Mike. I understand that it is shipped with a product that uses it for 
> that purpose. To be honest, I have used Flume in 3 different projects so far 
> and none of them have integrated with Hadoop. I do have an upcoming project 
> that probably will, although Hadoop will probably only be one of the 
> destinations the data is delivered to. The others might be a third party SIEM 
> product as well as some kind of ELK stack, so even in that case Hadoop 
> wouldn’t be the primary “selling” point.
> 
> No, I haven’t done profiling yet. At this point my main focus is Log4j. Once 
> I get past that I can take a pass at profiling. It is possible the problem 
> might be in Log4j, but since the embedded Appender just constructs the event 
> and passes it to the Flume Embedded Agent I would be surprised if it is in 
> Log4j. However, while testing I did find one bug already in Log4j that was 
> causing a performance hit with Flume and have corrected that. 
> 
> Ralph
> 
>> On Apr 28, 2019, at 11:42 AM, Mike Percy  wrote:
>> 
>> I’d certainly be in favor of updating the project description to be more 
>> general. That said, part of Flume’s value proposition is integration with a 
>> bunch of components off the shelf and the main ones it ships are Hadoop 
>> ecosystem components, so we shouldn’t completely ignore that when describing 
>> the project.
>> 
>> Regarding the memory channel perf issues you observed, did you do any 
>> profiling? Do you think part of the issue could be Java GC? The memory 
>> channel tends to allocate and reclaim a lot of memory in a short period of 
>> time.
>> 
>> Mike
>> 
>> Sent from my iPhone
>> 
>>> On Apr 28, 2019, at 11:35 AM, Ralph Goers  
>>> wrote:
>>> 
>>> What I am seeing is that people go to the home page and cut the first 
>>> paragraph as a description of Flume. All I am really proposing is that we 
>>> change that to more effectively describe Flume. The description that is 
>>> there is accurate but minimal. I would just like to rephrase that paragraph 
>>> to give a more complete description of what Flume can be used for.
>>> 
>>> As an aside, I have been working on Log4j, Spring-Cloud-Config and docker. 
>>> In doing that I have done some crude benchmarking which you can see at 
>>> http://rgoers.github.io/log4j2-site/manual/cloud.html#Appender_Performance.
>>> I was quite surprised by the performance of the Flume Embedded Appender with 
>>> a memory channel. I would have expected it to be more in line with the 
>>> Async Loggers and at the most in line with the Rolling File Appender since 
>>> the event is essentially handed to another thread to be processed. It 
>>> would be nice to see Flume be recommended for use as a log 
>>> forwarder/aggregator for all apps with Docker instead of just when 
>>> guaranteed delivery is required and I would love to upgrade the Flume 
>>> documentation to describe how to do that.
>>> 
>>> Ralph
>>> 
>>>> On Apr 28, 2019, at 9:58 AM, Bessenyei Balázs Donát  
>>>> wrote:
>>>> 
>>>> I agree that marketing could be improved and I support finding a
>>>> slogan that represents best what Flume is today.
>>>> I am not sure about the wording that has been proposed, though. Can
>>>> you please elaborate, Ralph?
>>>> 
>>>> 
>>>> Thank you,
>>>> 
>>>> Donat
>>>> 
>>>>> On Sun, Apr 28, 2019 at 6:19 PM Ralph Goers  
>>>>> wrote:
>>>>> 
>>>>> When I read sites like 
>>>>> https://www.slant.co/versus/959/960/~fluentd_vs_flume I get a bit 
>>>>> discouraged at how people misunderstand Flume. Even a site like 
>>>>> https://www.predictiveanalyticstoday.com/data-ingestion-tools/ is 
>>>>> misleading by copying our home page by just saying "Flume is a 
>>>>> distributed, reliable, and available service for efficiently collecting, 

Re: Better Marketing

2019-04-28 Thread Mike Percy
I’d certainly be in favor of updating the project description to be more 
general. That said, part of Flume’s value proposition is integration with a 
bunch of components off the shelf and the main ones it ships are Hadoop 
ecosystem components, so we shouldn’t completely ignore that when describing 
the project.

Regarding the memory channel perf issues you observed, did you do any 
profiling? Do you think part of the issue could be Java GC? The memory channel 
tends to allocate and reclaim a lot of memory in a short period of time.

Mike

Sent from my iPhone

> On Apr 28, 2019, at 11:35 AM, Ralph Goers  wrote:
> 
> What I am seeing is that people go to the home page and cut the first 
> paragraph as a description of Flume. All I am really proposing is that we 
> change that to more effectively describe Flume. The description that is there 
> is accurate but minimal. I would just like to rephrase that paragraph to give 
> a more complete description of what Flume can be used for.
> 
> As an aside, I have been working on Log4j, Spring-Cloud-Config and docker. In 
> doing that I have done some crude benchmarking which you can see at 
> http://rgoers.github.io/log4j2-site/manual/cloud.html#Appender_Performance.
> I was quite surprised by the performance of the Flume Embedded Appender with a 
> memory channel. I would have expected it to be more in line with the Async 
> Loggers and at the most in line with the Rolling File Appender since the 
> event is essentially handed to another thread to be processed. It would be 
> nice to see Flume be recommended for use as a log 
> forwarder/aggregator for all apps with Docker instead of just when guaranteed 
> delivery is required and I would love to upgrade the Flume documentation to 
> describe how to do that.
> 
> Ralph
> 
>> On Apr 28, 2019, at 9:58 AM, Bessenyei Balázs Donát  
>> wrote:
>> 
>> I agree that marketing could be improved and I support finding a
>> slogan that represents best what Flume is today.
>> I am not sure about the wording that has been proposed, though. Can
>> you please elaborate, Ralph?
>> 
>> 
>> Thank you,
>> 
>> Donat
>> 
>>> On Sun, Apr 28, 2019 at 6:19 PM Ralph Goers  
>>> wrote:
>>> 
>>> When I read sites like 
>>> https://www.slant.co/versus/959/960/~fluentd_vs_flume I get a bit 
>>> discouraged at how people misunderstand Flume. Even a site like 
>>> https://www.predictiveanalyticstoday.com/data-ingestion-tools/ is 
>>> misleading by copying our home page by just saying "Flume is a distributed, 
>>> reliable, and available service for efficiently collecting, aggregating, 
>>> and moving large amounts of log data” and then copying the image. This 
>>> leads users to believe that Flume is only useful in a small set of use 
>>> cases and is intimately tied to Hadoop.
>>> 
>>> I believe the home page should be changed to say that "Flume is a 
>>> distributed, reliable, and available service for efficiently collecting, 
>>> aggregating, and streaming large amounts of data”, and then following up to 
>>> indicate that it is appropriate to use to move any kind of streaming data 
>>> such as application, audit, or system logs, real time events such as stock 
>>> quotes, or user transaction records.
>>> 
>>> The second sentence should also be modified to say "It is robust and fault 
>>> tolerant with tunable reliability mechanisms that can ensure guaranteed 
>>> delivery and many failover and recovery mechanisms”.
>>> 
>>> I also think the very first image should be modified to not show just a web 
>>> application and HDFS as it seems to give people the impression that Flume 
>>> is only usable with Hadoop or in web applications. Unfortunately, only the 
>>> png seems to have been committed so redoing the diagram will mean starting 
>>> from scratch.
>>> 
>>> Thoughts?
>>> 
>>> Ralph
>> 
> 



[ANNOUNCE] Change of Apache Flume PMC Chair

2019-03-23 Thread Mike Percy
Dear Flume community,

I have had the opportunity to serve as the Flume PMC Chair for the last
year and some months, and for personal reasons have decided to step down at
this time.

I am very happy to announce that based on the PMC's recommendation, the
Apache Foundation board has appointed Ferenc Szabó to be the new PMC Chair
of the Apache Flume project. Ferenc has made significant contributions to
the project and is one of the most active contributors to the project. I am
confident that Ferenc will do everything possible to continue growing the
project and driving it forward.

Please join me in congratulating Ferenc on his appointment and welcoming
him to this role.

Regards,
Mike


Re: [ANNOUNCE] New Flume PMC member - Ferenc Szabo

2019-01-30 Thread Mike Percy
Congratulations Ferenc and welcome to the PMC! Thanks so much for all your
diligence and for running the 1.9.0 release!

Best regards,
Mike

On Wed, Jan 30, 2019 at 8:09 AM Denes Arvay  wrote:

> Hello Flume community,
>
> On behalf of the Apache Flume PMC I am pleased to announce that Ferenc
> Szabo (szaboferee) has accepted our invitation to become a PMC member on
> the Apache Flume project.
> Ferenc has been regularly contributing improvements to Flume, including
> multiple bigger features/enhancements, and has initiated changes to help
> improve code quality. His main achievement recently was coordinating the 1.9
> release. During this process he not only managed to roll out the release
> but also went the extra mile by improving the tooling around the release
> process and by updating the documentation.
>
> We appreciate Ferenc stepping up to take more responsibility in the Flume
> project and we look forward to his continued contributions!
>
> Please join me in welcoming Ferenc to the Apache Flume PMC!
>
> Kind regards,
> Denes
>


Re: [NOTICE] Mandatory migration of git repositories to gitbox.apache.org

2019-01-08 Thread Mike Percy
Thanks Ferenc for filing this ticket!

Mike

> On Jan 8, 2019, at 6:56 AM, Ferenc Szabo  wrote:
> 
> I have created the migration ticket:
> https://issues.apache.org/jira/browse/INFRA-17586
> 
> 
> On Sat, Jan 5, 2019 at 12:10 PM Bessenyei Balázs Donát 
> wrote:
> 
>> I support the move and I think we should do it as soon as possible.
>> If there is anything I can do to support, let me know.
>> 
>> 
>> Donat
>> 
>>> On Sat, Jan 5, 2019 at 12:26 AM Mike Percy  wrote:
>>> 
>>> Hi Ferenc, OK sounds good. Let me know if you want help looking at any of
>>> the web site issues.
>>> 
>>> I agree that the gitbox site update should be done separately to avoid
>>> impacting the release and for ease of managing the changes.
>>> 
>>> I'm happy to create the INFRA ticket when the time comes, or if you want to
>>> do it that's fine with me as well. It seems to me that unless we hear
>>> differently from someone else in the next day or so, we have lazy consensus
>>> on asking ASF Infra to move Flume to gitbox after the 1.9.0 release.
>>> 
>>> Mike
>>> 
>>> 
>>> 
>>> 
>>> On Thu, Jan 3, 2019 at 4:33 PM Ferenc Szabo wrote:
>>> 
>>>> Hi Mike,
>>>> 
>>>> +1 from me to the gitbox migration.
>>>> 
>>>> 
>>>> 
>>>> I am currently working on the website update. I just want to carefully
>>>> check the whole site to fix dead links and wrong asset URLs that are
>>>> present on the current deployed site.
>>>> It will be done soon. After that, I am going to do the announcement and I
>>>> have already planned to handle the gitbox migration.
>>>> I would like to do it in a separate site update so the two updates do not
>>>> hold back each other. It is not a big change.
>>>> 
>>>> 
>>>> When we have a consensus here on this thread then I can create the JIRA
>>>> ticket.
>>>> 
>>>> 
>>>> Regards,
>>>> Ferenc
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On Thu, Jan 3, 2019 at 6:38 PM Mike Percy  wrote:
>>>>> 
>>>>> Hi Flume devs,
>>>>> I propose we ask ASF Infra to move the Flume Git repo to GitBox as soon as
>>>>> the release has been finalized / announced. Once they switch things over,
>>>>> we can update the web site / documentation to reflect that.
>>>>> 
>>>>> Does anyone see any problems with this approach?
>>>>> 
>>>>> Ferenc, when do you expect to announce the release and push the updated
>>>>> site / docs? We can coordinate the date with infra@ based on that.
>>>>> 
>>>>> Thanks,
>>>>> Mike
>>>>> 
>>>>> 
>>>>> On Thu, Jan 3, 2019 at 5:19 AM Apache Infrastructure Team <
>>>>> infrastruct...@apache.org> wrote:
>>>>> 
>>>>>> Hello, flume folks.
>>>>>> As stated earlier in 2018, all git repositories must be migrated from
>>>>>> the git-wip-us.apache.org URL to gitbox.apache.org, as the old service
>>>>>> is being decommissioned. Your project is receiving this email because
>>>>>> you still have repositories on git-wip-us that need to be migrated.
>>>>>> 
>>>>>> The following repositories on git-wip-us belong to your project:
>>>>>> - flume.git
>>>>>> 
>>>>>> 
>>>>>> We are now entering the mandated (coordinated) move stage of the roadmap,
>>>>>> and you are asked to please coordinate migration with the Apache
>>>>>> Infrastructure Team before February 7th. All repositories not migrated
>>>>>> on February 7th will be mass migrated without warning, and we'd appreciate
>>>>>> it if we could work together to avoid a big mess that day :-).
>>>>>> 
>>>>>> Moving to gitbox means you will get full write access on GitHub as well,
>>>>>> and be able to close/merge pull requests and much more.
>>>>>> 
>>>>>> To have your repositories moved, please follow these steps:
>>>>>> 
>>>>>> - Ensure consensus on the move (a link to a lists.apache.org thread will
>>>>>>   suffice for us as evidence).
>>>>>> - Create a JIRA ticket at https://issues.apache.org/jira/browse/INFRA
>>>>>> 
>>>>>> Your migration should only take a few minutes. If you wish to migrate
>>>>>> at a specific time of day or date, please do let us know in the ticket.
>>>>>> 
>>>>>> As always, we appreciate your understanding and patience as we move
>>>>>> things around and work to provide better services and features for
>>>>>> the Apache Family.
>>>>>> 
>>>>>> Should you wish to contact us with feedback or questions, please do so
>>>>>> at: us...@infra.apache.org.
>>>>>> 
>>>>>> 
>>>>>> With regards,
>>>>>> Apache Infrastructure
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> 
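
For an existing local clone, the move described in the notice amounts to pointing the remote at the new host. A minimal sketch, assuming the remote is named origin and that the repository keeps the same path on gitbox.apache.org (the notice names only the two hosts, so the unchanged path is an assumption); to_gitbox_url is a hypothetical helper name:

```shell
#!/bin/sh
# to_gitbox_url URL: rewrite a git-wip-us.apache.org remote URL to the
# gitbox.apache.org host and print it; any other URL is printed unchanged.
to_gitbox_url() {
  printf '%s\n' "$1" | sed 's#://git-wip-us\.apache\.org/#://gitbox.apache.org/#'
}

# Usage inside an existing clone (not executed here):
#   git remote set-url origin "$(to_gitbox_url "$(git remote get-url origin)")"
```

Rewriting only the host keeps the sketch safe to run on clones that already point elsewhere, such as a GitHub mirror.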



Re: [ANNOUNCE] Apache Flume 1.9.0 released

2019-01-08 Thread Mike Percy
Nice! Thanks Ferenc for managing this release!

Regards,
Mike

> On Jan 8, 2019, at 6:34 AM, Ferenc Szabo  wrote:
> 
> The Apache Flume team is pleased to announce the release of Flume
> version 1.9.0.
> 
> Flume is a distributed, reliable, and available service for efficiently
> collecting, aggregating, and moving large amounts of log data.
> 
> This release can be downloaded from the Flume download page at:
> http://flume.apache.org/download.html
> 
> The change log and documentation are available on the 1.9.0 release page:
> http://flume.apache.org/releases/1.9.0.html
> 
> Your help and feedback is more than welcome. For more information on how
> to report problems and to get involved, visit the project website at
> http://flume.apache.org/
> 
> The Apache Flume Team


Re: [NOTICE] Mandatory migration of git repositories to gitbox.apache.org

2019-01-04 Thread Mike Percy
Hi Ferenc, OK sounds good. Let me know if you want help looking at any of
the web site issues.

I agree that the gitbox site update should be done separately to avoid
impacting the release and for ease of managing the changes.

I'm happy to create the INFRA ticket when the time comes, or if you want to
do it that's fine with me as well. It seems to me that unless we hear
differently from someone else in the next day or so, we have lazy consensus
on asking ASF Infra to move Flume to gitbox after the 1.9.0 release.

Mike




On Thu, Jan 3, 2019 at 4:33 PM Ferenc Szabo  wrote:

> Hi Mike,
>
> +1 from me to the gitbox migration.
>
>
>
> I am currently working on the website update. I just want to carefully
> check the whole site to fix dead links and wrong asset URLs that are
> present on the current deployed site.
> It will be done soon. After that, I am going to do the announcement and I
> have already planned to handle the gitbox migration.
> I would like to do it in a separate site update so the two updates do not
> hold back each other. It is not a big change.
>
>
> When we have a consensus here on this thread then I can create the JIRA
> ticket.
>
>
> Regards,
> Ferenc
>
>
>
>
>
> On Thu, Jan 3, 2019 at 6:38 PM Mike Percy  wrote:
>
> > Hi Flume devs,
> > I propose we ask ASF Infra to move the Flume Git repo to GitBox as soon as
> > the release has been finalized / announced. Once they switch things over,
> > we can update the web site / documentation to reflect that.
> >
> > Does anyone see any problems with this approach?
> >
> > Ferenc, when do you expect to announce the release and push the updated
> > site / docs? We can coordinate the date with infra@ based on that.
> >
> > Thanks,
> > Mike
> >
> >
> > On Thu, Jan 3, 2019 at 5:19 AM Apache Infrastructure Team <
> > infrastruct...@apache.org> wrote:
> >
> > > Hello, flume folks.
> > > As stated earlier in 2018, all git repositories must be migrated from
> > > the git-wip-us.apache.org URL to gitbox.apache.org, as the old service
> > > is being decommissioned. Your project is receiving this email because
> > > you still have repositories on git-wip-us that need to be migrated.
> > >
> > > The following repositories on git-wip-us belong to your project:
> > >  - flume.git
> > >
> > >
> > > We are now entering the mandated (coordinated) move stage of the roadmap,
> > > and you are asked to please coordinate migration with the Apache
> > > Infrastructure Team before February 7th. All repositories not migrated
> > > on February 7th will be mass migrated without warning, and we'd appreciate
> > > it if we could work together to avoid a big mess that day :-).
> > >
> > > Moving to gitbox means you will get full write access on GitHub as well,
> > > and be able to close/merge pull requests and much more.
> > >
> > > To have your repositories moved, please follow these steps:
> > >
> > > - Ensure consensus on the move (a link to a lists.apache.org thread will
> > >   suffice for us as evidence).
> > > - Create a JIRA ticket at https://issues.apache.org/jira/browse/INFRA
> > >
> > > Your migration should only take a few minutes. If you wish to migrate
> > > at a specific time of day or date, please do let us know in the ticket.
> > >
> > > As always, we appreciate your understanding and patience as we move
> > > things around and work to provide better services and features for
> > > the Apache Family.
> > >
> > > Should you wish to contact us with feedback or questions, please do so
> > > at: us...@infra.apache.org.
> > >
> > >
> > > With regards,
> > > Apache Infrastructure
> > >
> > >
> >
>


Re: [NOTICE] Mandatory migration of git repositories to gitbox.apache.org

2019-01-03 Thread Mike Percy
Hi Flume devs,
I propose we ask ASF Infra to move the Flume Git repo to GitBox as soon as
the release has been finalized / announced. Once they switch things over,
we can update the web site / documentation to reflect that.

Does anyone see any problems with this approach?

Ferenc, when do you expect to announce the release and push the updated
site / docs? We can coordinate the date with infra@ based on that.

Thanks,
Mike


On Thu, Jan 3, 2019 at 5:19 AM Apache Infrastructure Team <
infrastruct...@apache.org> wrote:

> Hello, flume folks.
> As stated earlier in 2018, all git repositories must be migrated from
> the git-wip-us.apache.org URL to gitbox.apache.org, as the old service
> is being decommissioned. Your project is receiving this email because
> you still have repositories on git-wip-us that need to be migrated.
>
> The following repositories on git-wip-us belong to your project:
>  - flume.git
>
>
> We are now entering the mandated (coordinated) move stage of the roadmap,
> and you are asked to please coordinate migration with the Apache
> Infrastructure Team before February 7th. All repositories not migrated
> on February 7th will be mass migrated without warning, and we'd appreciate
> it if we could work together to avoid a big mess that day :-).
>
> Moving to gitbox means you will get full write access on GitHub as well,
> and be able to close/merge pull requests and much more.
>
> To have your repositories moved, please follow these steps:
>
> - Ensure consensus on the move (a link to a lists.apache.org thread will
>   suffice for us as evidence).
> - Create a JIRA ticket at https://issues.apache.org/jira/browse/INFRA
>
> Your migration should only take a few minutes. If you wish to migrate
> at a specific time of day or date, please do let us know in the ticket.
>
> As always, we appreciate your understanding and patience as we move
> things around and work to provide better services and features for
> the Apache Family.
>
> Should you wish to contact us with feedback or questions, please do so
> at: us...@infra.apache.org.
>
>
> With regards,
> Apache Infrastructure
>
>


Re: [VOTE] Release Apache Flume version 1.9.0 RC3

2018-12-19 Thread Mike Percy
+1 on RC3

 - Sigs and checksums match for source and binary artifacts
 - Source artifact matches git tag
 - LICENSE file looks good
 - README looks good
 - Source artifact builds and tests pass on Ubuntu bionic
 - Binary artifact runs on Ubuntu bionic with minimal configuration

Thanks for creating the new build, Ferenc!

Mike

On Mon, Dec 17, 2018 at 12:41 PM Ferenc Szabo  wrote:

> Dear Flume Community,
>
> This is the 12th release for Apache Flume as a top-level project,
> version 1.9.0. We are voting on release candidate RC3.
>
> It fixes the following issues:
>
> https://raw.githubusercontent.com/apache/flume/release-1.9.0-rc3/CHANGELOG
>
> *** Please cast your vote within the next 72 hours ***
>
> The tarball (*.tar.gz), signature (*.asc), and checksums (*.sha512)
> for the source and binary artifacts can be found here:
>   http://people.apache.org/~szaboferee/apache-flume-1.9.0-rc3/
>
> Maven staging repo:
>   https://repository.apache.org/content/repositories/orgapacheflume-1032/
>
> The tag to be voted on:
>
>
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=d4fcab4f501d41597bc616921329a4339f73585e
>
> Flume's KEYS file containing PGP keys we use to sign the release:
>   https://svn.apache.org/repos/asf/flume/dist/KEYS
>
>
> Regards,
> Ferenc
>
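
The checksum half of "sigs and checksums match" can be sketched as below. It assumes GNU coreutils sha512sum is available and compares against a bare hex digest; the layout of the published *.sha512 files is not shown in this thread, so extracting the digest from them is left out. Signature checks would additionally need gpg --verify with the keys from the KEYS file. check_sha512 is a hypothetical helper name.

```shell
#!/bin/sh
# check_sha512 FILE EXPECTED_HEX: succeed if FILE's SHA-512 digest equals
# EXPECTED_HEX (compared case-insensitively), fail otherwise.
check_sha512() {
  # sha512sum prints "<digest>  <file>"; keep only the digest column.
  actual=$(sha512sum "$1" | awk '{ print $1 }')
  # Normalise an upper-case expected digest to lower case.
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  [ "$actual" = "$expected" ]
}
```

A call such as check_sha512 some-artifact.tar.gz "$digest" && echo OK would then be repeated for the source and the binary tarball; the file name here is a placeholder.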


Re: [VOTE] Release Apache Flume version 1.9.0 RC2

2018-12-14 Thread Mike Percy
Sorry for the delay in voting.

-1 on RC2 because the source artifact should match the tag, but it has a
.git folder in there:

$ diff -r apache-flume-1.9.0-src src-tag
Only in apache-flume-1.9.0-src: .git

I am not sure why we don't just use the script designed to generate source
artifacts (dev-support/generate-source-release.sh).

Other than that problem, everything else seems to look good:
 - Checksums and sigs match
 - README and LICENSE look good
 - Binary artifact runs
 - Source artifact builds and tests pass on Ubuntu bionic

Thanks,
Mike


On Sun, Dec 9, 2018 at 8:26 PM Ferenc Szabo  wrote:

> Dear Flume Community,
>
> This is the 12th release for Apache Flume as a top-level project,
> version 1.9.0. We are voting on release candidate RC2.
>
> It fixes the following issues:
>
> https://raw.githubusercontent.com/apache/flume/release-1.9.0-rc2/CHANGELOG
>
> *** Please cast your vote within the next 72 hours ***
>
> The tarball (*.tar.gz), signature (*.asc), and checksums (*.sha512)
> for the source and binary artifacts can be found here:
>   http://people.apache.org/~szaboferee/apache-flume-1.9.0-rc2/
>
> Maven staging repo:
>   https://repository.apache.org/content/repositories/orgapacheflume-1031/
>
> The tag to be voted on:
>
>
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=cb8f1f576d365641030499b434d1a0629f46f884
>
> Flume's KEYS file containing PGP keys we use to sign the release:
>   https://svn.apache.org/repos/asf/flume/dist/KEYS
>
>
> Regards,
> Ferenc
>


Re: [VOTE] Release Apache Flume version 1.9.0 RC1

2018-12-06 Thread Mike Percy
Thanks Ferenc for putting this together!

I agree with Denes that we should ensure the tag and source artifact match
exactly.

There shouldn't be any extra files if we use
./dev-support/generate-source-release.sh to generate the source release
files, because that script uses git archive to generate the source release
from the tag. However, I just tried it and it looks like
dev-support/sign-checksum-artifact.sh needs a chmod +x that isn't checked
in at the moment, which should be trivial to fix.

Other than that:
 - Checksums and sigs look good.
 - README, LICENSE files look good.
 - I was able to build the source artifact on Mac

Mike

On Wed, Dec 5, 2018 at 4:16 AM Denes Arvay  wrote:

> Hi Ferenc,
>
> Thank you for creating the first release candidate.
>
> I did the following checks:
> 1) checked the checksums & the signatures - OK
> 2) compared the contents of the src.tar.gz with the repository @
> release-1.9.0-rc1: here I found the following unnecessary files (they are
> not in the repository):
> - ./${project.basedir}
> - ./.mvn/wrapper/maven-wrapper.jar
> - ./flume-checkstyle
> - ./flume-ng-sinks/flume-hive-sink/derby.log
> 2b) after removing these files and directories I verified that the content
> of the src.tar.gz matches the content of the repository. I used the
> following command: find . -type f | grep -v ".git" | sort | xargs cat |
> shasum
> 3) compiled the contents of the src.tar.gz, it was successful as expected.
> 4) compared the jars in the bin.tar.gz with the jars created by mvn install
> in the src with jardiff [1] and there were only metadata diffs due to the
> different environment
>
> Due to the issues found in 2) I'd vote a -1 for this RC and ask Ferenc to
> go ahead with creating an RC2.
> Plus I found some outdated parts in the User Guide. I've opened a pull
> request [2] to remove those; I think it would be worth including that
> change in 1.9 too.
>
> Thanks,
> Denes
>
> [1] https://github.com/scala/jardiff
> [2] https://github.com/apache/flume/pull/255
>
>
> On Tue, Dec 4, 2018 at 6:09 PM Ferenc Szabo  wrote:
>
> > Dear Flume Community,
> >
> > This is the 12th release for Apache Flume as a top-level project,
> > version 1.9.0. We are voting on release candidate RC1.
> >
> > It fixes the following issues:
> >
> >
> https://raw.githubusercontent.com/apache/flume/release-1.9.0-rc1/CHANGELOG
> >
> > *** Please cast your vote within the next 72 hours ***
> >
> > The tarball (*.tar.gz), signature (*.asc), and checksums (*.sha512)
> > for the source and binary artifacts can be found here:
> >   http://people.apache.org/~szaboferee/apache-flume-1.9.0-rc1/
> >
> > Maven staging repo:
> >
> https://repository.apache.org/content/repositories/orgapacheflume-1029/
> >
> > The tag to be voted on:
> >
> >
> >
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=4fcf23d7eeecebcad39995bdb8dcfeb03453273b
> >
> > Flume's KEYS file containing PGP keys we use to sign the release:
> >   https://svn.apache.org/repos/asf/flume/dist/KEYS
> >
> > Regards,
> > Ferenc
> >
>


Re: [DISCUSS] Flume 1.9 release proposal

2018-11-30 Thread Mike Percy
Hi Ralph,
Let’s update the release notes to indicate that it is coming and do it in the 
next release so people are forewarned of the change. After speaking with others 
it sounds like Flume users are willing to accept the change (some have already 
had to do it because of Solr) so I have changed my view regarding whether we 
should wait for Flume 2.0 to do it. I’m now on board doing the log4j2 switch in 
a 1.10 release if that is what we want to release next time.

Mike

> On Nov 30, 2018, at 2:20 PM, Ralph Goers  wrote:
> 
> Mike, 
> 
> This response confuses me. Are you saying that because the customers are good 
> with the config change Flume 1.9 can go out with Log4j 2 or are you saying 
> you are good with updating the release notes to say it will happen in the 
> next release?
> 
> Ralph
> 
>> On Nov 29, 2018, at 10:53 PM, Mike Percy  wrote:
>> 
>> After speaking with a couple of large users of Flume, because log4j2 is ABI 
>> compatible with log4j1, it sounds like the config change will be acceptable 
>> to them at upgrade time and so I am +1 on this proposal.
>> 
>> Regards,
>> Mike
>> 
>>> On Nov 28, 2018, at 5:37 AM, Ralph Goers  wrote:
>>> 
>>> Again, please update the release notes to indicate the logging 
>>> configuration change will be coming with the next release.
>>> 
>>> Ralph
>>> 
>>>> On Nov 28, 2018, at 1:17 AM, Denes Arvay  
>>>> wrote:
>>>> 
>>>> Hi All,
>>>> 
>>>> I agree that the current content doesn't justify the 2.0 release, so I'd
>>>> vote for moving forward with the 1.9 as Ferenc recommended, i.e. without
>>>> the 2 previously mentioned breaking changes.
>>>> I also agree with the proposed new features, especially with the plugin
>>>> system/modularization, which was discussed on the other thread:
>>>> https://s.apache.org/2nm9
>>>> 
>>>> So, Ferenc, I think you are good to go with the proposed plan, unless
>>>> anybody vetoes.
>>>> 
>>>> Regards,
>>>> Denes
>>>> 
>>>> On Tue, Nov 27, 2018 at 4:37 PM Ralph Goers 
>>>> wrote:
>>>> 
>>>>> While Log4j 2 supports using properties files for configuration it is not
>>>>> syntactically compatible with Log4j 1. However, migrating a Log4j 1
>>>>> configuration to Log4j 2 isn’t a difficult task.
>>>>> 
>>>>> Ralph
>>>>> 
>>>>>> On Nov 27, 2018, at 4:38 AM, Tristan Stevens
>>>>>  wrote:
>>>>>> 
>>>>>> I did start thinking a while back about a REST API that could be used for
>>>>>> configuring Flume, after which you could potentially bolt on a UI. I
>>>>>> stopped work on it because it got tricky around configuring sources (and
>>>>>> the way in which they relate to channels etc). If someone had more time
>>>>> I’d
>>>>>> be happy to push to a repo the work that I did. This could be a really
>>>>>> useful 2.0 feature.
>>>>>> 
>>>>>> Regarding move from Log4j1.x to 2, it’d be interesting to see what other
>>>>>> Apache projects, such as Hadoop, Hive etc did around this (in fact, are
>>>>>> they even at 2.x yet?). For me, if you need to change config, it’s not a
>>>>>> minor release, it becomes major, unless there’s a way you can
>>>>> automatically
>>>>>> migrate config from one to the other. I’ve not checked, but are we sure
>>>>>> that .properties file won’t work with log4j2?
>>>>>> 
>>>>>> Tristan
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 26 November 2018 at 19:54:01, Mike Percy (mpe...@apache.org) wrote:
>>>>>> 
>>>>>> Yeah, I agree with this, especially classloader isolation would be great
>>>>> to
>>>>>> have as well on the plugin side.
>>>>>> 
>>>>>> Mike
>>>>>> 
>>>>>> On Mon, Nov 26, 2018 at 11:50 AM Ralph Goers >>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> I think people want to do more than just upgrade Log4j for a Flume 2.0
>>>>>>> release. I would like to make configuration much more pluggable for
>>>>>>> example. There was also talk about splitting all the sources, sinks,

Re: [DISCUSS] Flume 1.9 release proposal

2018-11-29 Thread Mike Percy
After speaking with a couple of large users of Flume, because log4j2 is ABI 
compatible with log4j1, it sounds like the config change will be acceptable to 
them at upgrade time and so I am +1 on this proposal.

Regards,
Mike

> On Nov 28, 2018, at 5:37 AM, Ralph Goers  wrote:
> 
> Again, please update the release notes to indicate the logging configuration 
> change will be coming with the next release.
> 
> Ralph
> 
>> On Nov 28, 2018, at 1:17 AM, Denes Arvay  wrote:
>> 
>> Hi All,
>> 
>> I agree that the current content doesn't justify the 2.0 release, so I'd
>> vote for moving forward with the 1.9 as Ferenc recommended, i.e. without
>> the 2 previously mentioned breaking changes.
>> I also agree with the proposed new features, especially with the plugin
>> system/modularization, which was discussed on the other thread:
>> https://s.apache.org/2nm9
>> 
>> So, Ferenc, I think you are good to go with the proposed plan, unless
>> anybody vetoes.
>> 
>> Regards,
>> Denes
>> 
>> On Tue, Nov 27, 2018 at 4:37 PM Ralph Goers 
>> wrote:
>> 
>>> While Log4j 2 supports using properties files for configuration it is not
>>> syntactically compatible with Log4j 1. However, migrating a Log4j 1
>>> configuration to Log4j 2 isn’t a difficult task.
>>> 
>>> Ralph
>>> 
>>>> On Nov 27, 2018, at 4:38 AM, Tristan Stevens
>>>  wrote:
>>>> 
>>>> I did start thinking a while back about a REST API that could be used for
>>>> configuring Flume, after which you could potentially bolt on a UI. I
>>>> stopped work on it because it got tricky around configuring sources (and
>>>> the way in which they relate to channels etc). If someone had more time
>>> I’d
>>>> be happy to push to a repo the work that I did. This could be a really
>>>> useful 2.0 feature.
>>>> 
>>>> Regarding move from Log4j1.x to 2, it’d be interesting to see what other
>>>> Apache projects, such as Hadoop, Hive etc did around this (in fact, are
>>>> they even at 2.x yet?). For me, if you need to change config, it’s not a
>>>> minor release, it becomes major, unless there’s a way you can
>>> automatically
>>>> migrate config from one to the other. I’ve not checked, but are we sure
>>>> that .properties file won’t work with log4j2?
>>>> 
>>>> Tristan
>>>> 
>>>> 
>>>> 
>>>> On 26 November 2018 at 19:54:01, Mike Percy (mpe...@apache.org) wrote:
>>>> 
>>>> Yeah, I agree with this, especially classloader isolation would be great
>>> to
>>>> have as well on the plugin side.
>>>> 
>>>> Mike
>>>> 
>>>> On Mon, Nov 26, 2018 at 11:50 AM Ralph Goers >>> 
>>>> wrote:
>>>> 
>>>>> I think people want to do more than just upgrade Log4j for a Flume 2.0
>>>>> release. I would like to make configuration much more pluggable for
>>>>> example. There was also talk about splitting all the sources, sinks,
>>>>> interceptors, etc out of the core module and making them some sort of
>>>>> plugin. At this point those are just at the idea stage.
>>>>> 
>>>>>> On Nov 26, 2018, at 12:19 PM, Bessenyei Balázs Donát <
>>> bes...@apache.org>
>>>> 
>>>>> wrote:
>>>>>> 
>>>>>> I wonder, is there anything blocking us from releasing 2.0 next instead
>>>>> of 1.x?
>>>>>> 
>>>>>> 
>>>>>> Donat
>>>>>> 
>>>>>>> On Mon, Nov 26, 2018 at 6:27 PM Mike Percy  wrote:
>>>>>>> 
>>>>>>> If only the PMC would be affected by this decision, we could have had
>>>>> this
>>>>>>> discussion on the PMC private list. But this decision impacts
>>>> everybody
>>>>>>> that uses Flume. So let's hear from anybody who cares about this,
>>>>> including
>>>>>>> committers, contributors, and users on whether they are okay with
>>>>> switching
>>>>>>> to log4j2 in a minor release version, knowing that they will need to
>>>>> change
>>>>>>> their config files when they upgrade Flume.
>>>>>>> 
>>>>>>> Ferenc, it seems like we will have to ship one or the other with the
>>>>> binary

Re: [DISCUSS] Flume 1.9 release proposal

2018-11-26 Thread Mike Percy
Yeah, I agree with this, especially classloader isolation would be great to
have as well on the plugin side.

Mike

On Mon, Nov 26, 2018 at 11:50 AM Ralph Goers 
wrote:

> I think people want to do more than just upgrade Log4j for a Flume 2.0
> release. I would like to make configuration much more pluggable for
> example. There was also talk about splitting all the sources, sinks,
> interceptors, etc out of the core module and making them some sort of
> plugin. At this point those are just at the idea stage.
>
> > On Nov 26, 2018, at 12:19 PM, Bessenyei Balázs Donát 
> wrote:
> >
> > I wonder, is there anything blocking us from releasing 2.0 next instead
> of 1.x?
> >
> >
> > Donat
> >
> > On Mon, Nov 26, 2018 at 6:27 PM Mike Percy  wrote:
> >>
> >> If only the PMC would be affected by this decision, we could have had
> this
> >> discussion on the PMC private list. But this decision impacts everybody
> >> that uses Flume. So let's hear from anybody who cares about this,
> including
> >> committers, contributors, and users on whether they are okay with
> switching
> >> to log4j2 in a minor release version, knowing that they will need to
> change
> >> their config files when they upgrade Flume.
> >>
> >> Ferenc, it seems like we will have to ship one or the other with the
> binary
> >> artifacts at release time. It seems to me that we still have to make a
> >> choice about the built and shipped default, even if both could work at
> >> runtime, right?
> >>
> >> Thanks,
> >> Mike
> >>
> >> On Mon, Nov 26, 2018 at 5:58 AM Ferenc Szabo
> 
> >> wrote:
> >>
> >>> I have directed it to the community, the PMC members.
> >>> I am looking for a decision from the PMC members (including You) on
> whether
> >>> we should continue as planned or not removing log4j2 from the minor
> release
> >>> branch.
> >>>
> >>> On Mon, Nov 26, 2018 at 2:43 PM Ralph Goers <
> ralph.go...@dslextreme.com>
> >>> wrote:
> >>>
> >>>> Was this directed at me? I am not sure what decision you are referring
> >>> to.
> >>>> I stated my position in the email you replied to.
> >>>>
> >>>> Ralph
> >>>>
> >>>>> On Nov 26, 2018, at 5:29 AM, Ferenc Szabo
>  >>>>
> >>>> wrote:
> >>>>>
> >>>>> With the new release, you can just replace the jars on the classpath
> as
> >>>> we
> >>>>> removed every code dependency and are using slf4j. So you do not need to
> >>> fork
> >>>>> and change anything.
> >>>>>
> >>>>> Let me know your decision and we will continue with that.
> >>>>>
> >>>>> Regards,
> >>>>> Ferenc
> >>>>>
> >>>>> On Mon, Nov 26, 2018 at 12:10 PM Apache 
> >>>> wrote:
> >>>>>
> >>>>>> As I said, I am fine with adding a notice to the release notes
> stating
> >>>> the
> >>>>>> logging will be changed in the next release regardless of its
> version
> >>>>>> number. This way the revert only needs to happen on the release
> branch
> >>>> for
> >>>>>> this release.
> >>>>>>
> >>>>>> Ralph
> >>>>>>
> >>>>>>> On Nov 25, 2018, at 9:59 PM, Mike Percy  wrote:
> >>>>>>>
> >>>>>>> While the committer veto rules are well documented here
> >>>>>>> <https://www.apache.org/foundation/voting.html#Veto>, and there is
> >>> no
> >>>>>>> mention of a time limit, I propose we shelve that discussion and
> work
> >>>> on
> >>>>>>> getting to consensus on what we as a community want to include in
> the
> >>>>>> Flume
> >>>>>>> 1.9 release.
> >>>>>>>
> >>>>>>> As far as I can tell, the decision we have to make is whether to
> >>>> include
> >>>>>>> the FLUME-2050 logging changes in the Flume 1.9 release. We can
> >>> either:
> >>>>>>>
> >>>>>>> 1) Require users upgrading from Flume 1.8 to Flume 1.9 to have to
> >>>> modify

Re: [DISCUSS] Flume 1.9 release proposal

2018-11-26 Thread Mike Percy
If only the PMC would be affected by this decision, we could have had this
discussion on the PMC private list. But this decision impacts everybody
that uses Flume. So let's hear from anybody who cares about this, including
committers, contributors, and users on whether they are okay with switching
to log4j2 in a minor release version, knowing that they will need to change
their config files when they upgrade Flume.

Ferenc, it seems like we will have to ship one or the other with the binary
artifacts at release time. It seems to me that we still have to make a
choice about the built and shipped default, even if both could work at
runtime, right?

Thanks,
Mike

On Mon, Nov 26, 2018 at 5:58 AM Ferenc Szabo 
wrote:

> I have directed it to the community, the PMC members.
> I am looking for a decision from the PMC members (including You) on whether
> we should continue as planned or not removing log4j2 from the minor release
> branch.
>
> On Mon, Nov 26, 2018 at 2:43 PM Ralph Goers 
> wrote:
>
> > Was this directed at me? I am not sure what decision you are referring
> to.
> > I stated my position in the email you replied to.
> >
> > Ralph
> >
> > > On Nov 26, 2018, at 5:29 AM, Ferenc Szabo  >
> > wrote:
> > >
> > > With the new release, you can just replace the jars on the classpath as
> > we
> > > > removed every code dependency and are using slf4j. So you do not need to
> fork
> > > and change anything.
> > >
> > > Let me know your decision and we will continue with that.
> > >
> > > Regards,
> > > Ferenc
> > >
> > > On Mon, Nov 26, 2018 at 12:10 PM Apache 
> > wrote:
> > >
> > >> As I said, I am fine with adding a notice to the release notes stating
> > the
> > >> logging will be changed in the next release regardless of its version
> > >> number. This way the revert only needs to happen on the release branch
> > for
> > >> this release.
> > >>
> > >> Ralph
> > >>
> > >>> On Nov 25, 2018, at 9:59 PM, Mike Percy  wrote:
> > >>>
> > >>> While the committer veto rules are well documented here
> > >>> <https://www.apache.org/foundation/voting.html#Veto>, and there is
> no
> > >>> mention of a time limit, I propose we shelve that discussion and work
> > on
> > >>> getting to consensus on what we as a community want to include in the
> > >> Flume
> > >>> 1.9 release.
> > >>>
> > >>> As far as I can tell, the decision we have to make is whether to
> > include
> > >>> the FLUME-2050 logging changes in the Flume 1.9 release. We can
> either:
> > >>>
> > >>> 1) Require users upgrading from Flume 1.8 to Flume 1.9 to have to
> > modify
> > >>> their log4j configuration files to use the different log4j2 XML
> format,
> > >>> with the procedure for doing so documented in the release notes.
> > >>> -or-
> > >>> 2) Defer that change to a future release where incompatible changes
> are
> > >>> expected, such as Flume 2.0.
> > >>>
> > >>> Also, maybe there are other options I haven't thought of...?
> > >>>
> > >>> I would like to get some input from more people on this matter. How
> do
> > >>> others feel about this?
> > >>>
> > >>> Thanks,
> > >>> Mike
> > >>>
> > >>>
> > >>> On Sat, Nov 24, 2018 at 9:36 PM Ralph Goers <
> > ralph.go...@dslextreme.com>
> > >>> wrote:
> > >>>
> > >>>> I should also point out that the time to raise this was a year ago
> > when
> > >>>> the PR for FLUME-2050 was reviewed and committed. As it stands now I
> > >> could
> > >>>> be a jerk and vote -1 on the patch for FLUME-3296 with valid
> technical
> > >>>> grounds. If this was causing a true binary incompatibility I would
> > >> approve
> > >>>> reverting it in a heartbeat, but I just don’t see how having users
> > have
> > >> to
> > >>>> change logging configuration is “intolerable”, especially with the
> > known
> > >>>> security issues in Log4j 1, however unlikely users might be to
> > >> encounter
> > >>>> them.
> > >>>>
> > >>>> That said, I wouldn’t veto the revert on the release branch, but 

Re: [DISCUSS] Flume 1.9 release proposal

2018-11-25 Thread Mike Percy
While the committer veto rules are well documented here
<https://www.apache.org/foundation/voting.html#Veto>, and there is no
mention of a time limit, I propose we shelve that discussion and work on
getting to consensus on what we as a community want to include in the Flume
1.9 release.

As far as I can tell, the decision we have to make is whether to include
the FLUME-2050 logging changes in the Flume 1.9 release. We can either:

1) Require users upgrading from Flume 1.8 to Flume 1.9 to have to modify
their log4j configuration files to use the different log4j2 XML format,
with the procedure for doing so documented in the release notes.
-or-
2) Defer that change to a future release where incompatible changes are
expected, such as Flume 2.0.

Also, maybe there are other options I haven't thought of...?

I would like to get some input from more people on this matter. How do
others feel about this?

Thanks,
Mike


On Sat, Nov 24, 2018 at 9:36 PM Ralph Goers 
wrote:

> I should also point out that the time to raise this was a year ago when
> the PR for FLUME-2050 was reviewed and committed. As it stands now I could
> be a jerk and vote -1 on the patch for FLUME-3296 with valid technical
> grounds. If this was causing a true binary incompatibility I would approve
> reverting it in a heartbeat, but I just don’t see how having users have to
> change logging configuration is “intolerable”, especially with the known
> security issues in Log4j 1, however unlikely users might be to encounter
> them.
>
> That said, I wouldn’t veto the revert on the release branch, but I would
> suggest that the release notes provide fair warning that the next release
> will upgrade the logging dependency.  It would also be nice if releases
> could be more frequent than once a year.
>
> I would also like to say that I’m not doing this for the fun of it. My
> company uses Flume for some of its most critical processing. We run
> Veracode scans on all of our software and I expect Flume would be flagged
> if I hadn’t repacked it with Log4j 2. It also may not show up since the CVE
> is marked against Log4j 2 since Log4j 1 is EOL, but the security scanning
> tools should be flagging that as well.
>
> FWIW I’ve had to hack or enhance a few Flume components to make it work
> for my needs but overall it works really, really well. I’d like to commit
> back some of the changes but the main one - Flume configuration - I really
> don’t like and need to redo the whole thing.
>
> Ralph
>
> > On Nov 24, 2018, at 5:11 PM, Mike Percy  wrote:
> >
> > I wasn't aware of this security issue. Do you have a link to the details?
> >
> > There's no ASF requirement for package names; it's really just a Java
> > language convention, and especially not enforced for compatibility shims.
> > Even Flume has some of those in the legacy sources.
> >
> > I understand that it's very difficult to provide a compatibility layer.
> > Maybe also a boring task. I'm just saying that without log4j1 backwards
> > compatibility provided by log4j2, there will have to be a really critical
> > reason to inflict this migration pain on Flume users -- something that
> > simply can't be tolerated, even in a minor release, like a new and highly
> > dangerous security bug. Without such a motivation, I don't see how this
> > incompatible dependency change can be justified.
> >
> > Mike
> >
> > On Sat, Nov 24, 2018 at 3:20 PM Ralph Goers 
> > wrote:
> >
> >>
> >>> On Nov 24, 2018, at 2:35 PM, Mike Percy  wrote:
> >>>
> >>> Flume has long had a policy of backwards compatibility with its own
> >>> configuration files and people expect things to "just work" when
> >> upgrading
> >>> Flume. If log4j2 can't parse the log4j1 config file format then it's an
> >>> incompatible upgrade and should not be done in a minor Flume release.
> >>>
> >>> If log4j2 wants to be a drop-in replacement for log4j1 then by default
> it
> >>> should find and parse the traditional log4j.properties config files, at
> >>> least as a fallback, rather than force users to convert to the new XML
> >>> format before upgrading.
> >>
> >> That is simply not possible:
> >> 1. Log4j 1 requires the use of fully qualified class names. The package
> >> names log4j 1 used didn’t conform to ASF naming guidelines and don’t
> line
> >> up with the package names used in Log4j2.
> >> 2. Log4j 1 did not delineate between what was public and what was
> private
> >> so there is code all over the place mucking with the internals of Log4j.
> >> Log4j 2 implemented a compatibility layer by imp

Re: [DISCUSS] Flume 1.9 release proposal

2018-11-24 Thread Mike Percy
I wasn't aware of this security issue. Do you have a link to the details?

There's no ASF requirement for package names; it's really just a Java
language convention, and especially not enforced for compatibility shims.
Even Flume has some of those in the legacy sources.

I understand that it's very difficult to provide a compatibility layer.
Maybe also a boring task. I'm just saying that without log4j1 backwards
compatibility provided by log4j2, there will have to be a really critical
reason to inflict this migration pain on Flume users -- something that
simply can't be tolerated, even in a minor release, like a new and highly
dangerous security bug. Without such a motivation, I don't see how this
incompatible dependency change can be justified.

Mike

On Sat, Nov 24, 2018 at 3:20 PM Ralph Goers 
wrote:

>
> > On Nov 24, 2018, at 2:35 PM, Mike Percy  wrote:
> >
> > Flume has long had a policy of backwards compatibility with its own
> > configuration files and people expect things to "just work" when
> upgrading
> > Flume. If log4j2 can't parse the log4j1 config file format then it's an
> > incompatible upgrade and should not be done in a minor Flume release.
> >
> > If log4j2 wants to be a drop-in replacement for log4j1 then by default it
> > should find and parse the traditional log4j.properties config files, at
> > least as a fallback, rather than force users to convert to the new XML
> > format before upgrading.
>
> That is simply not possible:
> 1. Log4j 1 requires the use of fully qualified class names. The package
> names log4j 1 used didn’t conform to ASF naming guidelines and don’t line
> up with the package names used in Log4j2.
> 2. Log4j 1 did not delineate between what was public and what was private
> so there is code all over the place mucking with the internals of Log4j.
> Log4j 2 implemented a compatibility layer by implementing the classes that
> are being used but by and large they don’t actually do anything.
> 3. The Appenders and Filters in Log4j 2 implement different interfaces and
> cannot use components from Log4j 1.
> 4. The Appenders in Log4j 2 are not identical with Log4j 1. In fact, many
> people didn’t use the Rolling File Appender that shipped with Log4j but
> used the one from Log4j extras. So it is hard to know what Log4j 2 would
> have needed to be compatible with.
>
> Many organizations (including mine) have security requirements that say
> they must use software that is supported with security fixes. Log4j 1 has a
> known security bug that will never be fixed as it reached end-of-life in
> August of 2015. This means the use of Log4j 1 is not acceptable for any
> security conscious organization.
>
> While folks have brought up this issue from time to time most seem to
> adapt quite quickly to the change. Most of the issues are developers
> wanting to know how to make something they were doing in Log4j 1 work in
> Log4j 2.
>
> As you can imagine, since Log4j 2 has been GA now for 4 1/2 years and
> Log4j 1 has been EOL for over 3, making the configuration compatible with
> Log4j 1 is not a high priority. FWIW, there was an effort to do that a few
> years ago, and while we were able to get some basic stuff to work making it
> work in a general way wasn’t possible.
>
> Ralph
>
> >
> > Mike
> >
> > On Fri, Nov 23, 2018 at 9:39 AM Ralph Goers 
> > wrote:
> >
> >> No, you are correct. However, requiring a change to logging
> configuration
> >> has never been considered a binary compatibility break in any project I
> >> have ever worked on.
> >>
> >> Ralph
> >>
> >>> On Nov 23, 2018, at 10:05 AM, Ferenc Szabo  >
> >> wrote:
> >>>
> >>> As you mentioned, you have the freedom to use Log4j 2 and at the same time
> >> we
> >>> have to keep the out of the box experience the same in a minor version.
> >>> Users should be able to upgrade flume without changing any of their
> >>> configurations.
> >>> If they have a log4j.properties (Log4j 1) then they would not be able
> to
> >>> use it after the upgrade without changing it.
> >>>
> >>> Or am I missing a feature that would solve this case?
> >>>
> >>> On Fri, Nov 23, 2018 at 5:49 PM Ralph Goers <
> ralph.go...@dslextreme.com>
> >>> wrote:
> >>>
> >>>> Also, please put details in the Jira issues. It is much easier to find
> >> out
> >>>> why something was done by searching Jira later on than searching the
> >>>> mailing list.
> >>>>
> >>>> Ralph
> >>>>
> >>>

Re: [DISCUSS] Flume 1.9 release proposal

2018-11-24 Thread Mike Percy
Flume has long had a policy of backwards compatibility with its own
configuration files and people expect things to "just work" when upgrading
Flume. If log4j2 can't parse the log4j1 config file format then it's an
incompatible upgrade and should not be done in a minor Flume release.

If log4j2 wants to be a drop-in replacement for log4j1 then by default it
should find and parse the traditional log4j.properties config files, at
least as a fallback, rather than force users to convert to the new XML
format before upgrading.

Mike
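
The incompatibility under discussion shows up in the key names alone: a
Log4j 1 properties file uses keys such as log4j.rootLogger and
log4j.appender.<name>, while the Log4j 2 properties format uses keys such
as rootLogger.level and appender.<name>.type. A pre-upgrade check that
flags Log4j 1 style keys could look like the minimal Python sketch below.
The helper name find_log4j1_keys and the "key starts with log4j."
heuristic are assumptions for illustration; neither is part of Flume or
Log4j, and backslash line continuations are deliberately not handled.

```python
import re


def find_log4j1_keys(properties_text):
    """Return the keys that look like Log4j 1 configuration keys.

    Hypothetical helper: treats any key beginning with "log4j." as
    Log4j 1 style, which is a heuristic rather than a full parser.
    """
    hits = []
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and the two comment styles of .properties files.
        if not line or line.startswith(("#", "!")):
            continue
        # A key ends at the first '=', ':' or whitespace character.
        key = re.split(r"[=:\s]", line, maxsplit=1)[0]
        if key.startswith("log4j."):
            hits.append(key)
    return hits
```

An empty result suggests the file does not use the Log4j 1 key layout. A
non-empty result lists the entries that would have to be rewritten by hand
before switching logging implementations.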

On Fri, Nov 23, 2018 at 9:39 AM Ralph Goers 
wrote:

> No, you are correct. However, requiring a change to logging configuration
> has never been considered a binary compatibility break in any project I
> have ever worked on.
>
> Ralph
>
> > On Nov 23, 2018, at 10:05 AM, Ferenc Szabo 
> wrote:
> >
> > As you mentioned, you have the freedom to use Log4j 2 and at the same time
> we
> > have to keep the out of the box experience the same in a minor version.
> > Users should be able to upgrade flume without changing any of their
> > configurations.
> > If they have a log4j.properties (Log4j 1) then they would not be able to
> > use it after the upgrade without changing it.
> >
> > Or am I missing a feature that would solve this case?
> >
> > On Fri, Nov 23, 2018 at 5:49 PM Ralph Goers 
> > wrote:
> >
> >> Also, please put details in the Jira issues. It is much easier to find
> out
> >> why something was done by searching Jira later on than searching the
> >> mailing list.
> >>
> >> Ralph
> >>
> >>> On Nov 23, 2018, at 9:47 AM, Ralph Goers 
> >> wrote:
> >>>
> >>> I do not understand this at all. Log4j 2 provides runtime compatibility
> >> with Log4j 1. What is the problem that requires a revert?
> >>>
> >>> I have been running Flume with Log4j 2 since 1.6 so I don’t understand
> >> what the problem could possibly be.
> >>>
> >>> Ralph
> >>>
> >>>> On Nov 23, 2018, at 8:50 AM, Ferenc Szabo 
> >> wrote:
> >>>> 
> >>>> Hi everyone
> >>>> 
> >>>> I am about to branch the 1.9 release from trunk.
> >>>> 
> >>>> On the 1.9 branch we will revert the following breaking changes:
> >>>> - FLUME-2957. Remove Guava from our public API:
> >>>> 
> >>>> 
> >>
> https://github.com/apache/flume/commit/7f85df9e473ee675d461d5b76650694c5a6c0088
> >>>> - part of FLUME-2050. Upgrade to Log4j 2.10.0:
> >>>> as the new release should work with the previous configurations we
> have
> >>>> to release it with log4j 1.x
> >>>> For the log4j2 upgrade, we will provide a guide, how to replace the
> jars
> >>>> if users would like to start using it in the 1.9 release on the wiki
> >> page.
> >>>> 
> >>>> Because of these changes the first release candidate might be
> postponed
> >> to
> >>>> Monday.
> >>>> 
> >>>> Regards,
> >>>> Ferenc Szabo
> >>>> 
> >>>> On Wed, Nov 7, 2018 at 5:04 PM ema...@cloudera.com <
> ema...@cloudera.com
> >>>
> >>>> wrote:
> >>>> 
> > Hi Ferenc,
> >
> > +1
> > I am working on FLUME-3281 Update to Kafka 2.0 client, should be able
> >> to
> > finish it
> > till the suggested deadline.
> > I am also happy to do some reviews.
> >
> > Regards
> > Endre
> >
> > On 2018/11/06 21:23:17, Ferenc Szabo  wrote:
> >> Hello Flume Community,
> >>
> >> 1.8 was released about a year ago and since then quite a few bug
> >> fixes,
> >> improvements, features and documentation were introduced.
> >> I would like to propose to publish the next minor release of Flume
> >> to make these changes available to the users.
> >>
> >> I would be more than happy to be the Release Manager with the help
> of
> >> Denes Arvay for anything that requires PMC access - if both the
> >> community
> >> and he are
> >> OK with it.
> >>
> >> Among others the following changes will be included in the next
> >> release:
> >>
> >> Fixed bugs:
> >> - FLUME-3117 Application can be dead loop when call System.exit() in
> >> method configure
> >> - FLUME-3237 Handling RuntimeExceptions coming from the JMS provider
> >> in
> >> JMSSource
> >> - FLUME-3201 Fix SyslogUtil to handle RFC3164 format in December
> > correctly
> >> - FLUME-3056 TestApplication hangs indefinitely
> >> - FLUME-2976 Exception when JMS source tries to connect to a
> Weblogic
> >> server without authentication
> >> - FLUME-3270 Close JMS resources in JMSMessageConsumer constructor
> in
> > case
> >> of failure
> >> - FLUME-3222 java.nio.file.NoSuchFileException thrown when files are
> > being
> >> deleted from the TAILDIR source
> >> - FLUME-2894 Flume components should stop in the correct order
> >> (graceful
> >> shutdown)
> >> - FLUME-2973 Deadlock in hdfs sink
> >> - FLUME-3278 Handling -D keystore parameters in Kafka components
> >> - FLUME-3265 Cannot set batch-size for LoadBalancingRpcClient
> >>
> >>
> >> Improvements:
> >> - FLUME-3186 Make asyncHbaseClient configuration 

Re: What do we do with Integrations with no maintainer?

2018-11-19 Thread Mike Percy
I agree with the above... the JAR hell we are currently in seems mostly
unresolvable and shedding dependencies seems like a reasonable choice
regardless of whether we implement an isolated classloader or not.

Mike

On Mon, Nov 19, 2018 at 6:02 AM Ferenc Szabo  wrote:

> Hi all,
>
> I somewhat agree with Helmut.
> I think we should detach every component from the framework and provide an
> isolated classloader for them to avoid the dependency issues we have now.
>
> Basically, flume itself should be just the framework and every
> source/sink/channel/interceptor/etc would come as a plugin.
> We would have "official" plugins in the flume repository but as separate
> maven projects.
>
> After the 1.9 release, we could start planning it for 2.0 as it would be
> easiest (or only possible) to do with breaking changes, however, it would
> be good to aim for compatibility with 1.x plugins.
>
>  What do you think about that?
>
> Additionally, I would be happy to maintain a plugin list for flume with the
> 3rd party/community plugins.
>
> Regards,
>
> Ferenc
>
>
> On Mon, Nov 19, 2018 at 1:19 PM Wahrmann, Helmut 
> wrote:
>
> > Hi all,
> >
> > What are we doing with integrations having no maintainer?
> >
> > An example is the morphline sink. It supports Solr 4.3 and Apache has
> > already released Solr 7.5.0.
> > Kite SDK is at 1.1.
> > Seems that no one is taking care of it.
> >
> > On the other side we are still "supporting" Elasticsearch 0.90.1, while
> > ElasticSearch is already on 6.5.
> > Having a dependency of Lucene between Solr and Elastic I cannot push my
> > changes to flume.
> >
> > So while there would be a maintainer for ElasticSearch, new features
> > cannot be introduced, because of the above.
> >
> > Shouldn't we deprecate old stuff, where no maintainer is active?
> > If 1.9 is released with support of outdated SOLR and ElasticSearch, no
> one
> > would use Flume anyhow.
> > I doubt that someone would downgrade the ElasticSearch cluster to 0.90
> > just to be able to pump in events via Flume.
> >
> > I keep a working ElasticSearch 6.x Fork anyhow, but would like to
> > contribute back to the project.
> >
> > Note: I have tried to upgrade SOLR support to the newest version, but it
> > fails on the test cases and I don't have anything to test.
> >
> >
> > Thanks,
> >
> > Helmut
> >
> >
>


Minor reface to the Apache Flume project web site

2018-07-25 Thread Mike Percy
Hi all,
I went to add the issues@ list to the mailing list page on
http://flume.apache.org and remembered that I didn't care for some of the
fonts and colors on the site theme, so I updated the CSS to be prettier
while I was in there. Now it's more "nautical", in line with the color
scheme of the Flume logo. I didn't actually change the layout or the
templating engine.

I committed my web site changes directly (skipping review) since web site
issues are easy to go back and fix on the fly. If there are any wrapping or
rendering problems please let me know and I'll see if I can fix them.

To see the changes, you might have to shift-refresh to get the new CSS file
if your browser has the old one cached.

Mike


Re: Merge of patch in Flume-3021?

2018-04-26 Thread Mike Percy
Hi Helmut, yes I started investigating but I haven't gotten to the bottom
of the issue yet. I went out of town on holiday but I'll be back next week
so I will take another look soon.

Mike

On Tue, Apr 17, 2018 at 6:52 PM, Wahrmann, Helmut <helmut.wahrm...@rsa.com>
wrote:

> Hi Mike,
>
> Did you have a chance to look into this?
>
> thanks,
>
> Helmut
>
>
> -----Original Message-----
> From: Mike Percy [mailto:mpe...@apache.org]
> Sent: Monday, 26 March 2018 21:46
> To: dev@flume.apache.org
> Subject: Re: Merge of patch in Flume-3021?
>
> I haven't figured this out yet but I'll look into it this week.
>
> Mike
>
> On Mon, Mar 19, 2018 at 2:41 AM, Wahrmann, Helmut <helmut.wahrm...@rsa.com
> >
> wrote:
>
> > Hi Mike,
> >
> > With the help of Ferenc, I got rid of the initial errors.
> >
> > Only one is remaining now:
> >
> > [INFO] ---
> > [INFO]  T E S T S
> > [INFO] ---
> > [INFO] Running
> > org.apache.flume.sink.solr.morphline.TestBlobDeserializer
> > [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 1.126 s - in org.apache.flume.sink.solr.morphline.TestBlobDeserializer
> > [INFO] Running org.apache.flume.sink.solr.morphline.TestBlobHandler
> > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 1.125 s - in org.apache.flume.sink.solr.morphline.TestBlobHandler
> > [INFO] Running org.apache.flume.sink.solr.morphline.
> > TestMorphlineInterceptor
> > [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 7.724 s - in
> > org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor
> > [INFO] Running
> > org.apache.flume.sink.solr.morphline.TestMorphlineSolrSink
> > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> > 1.797 s <<< FAILURE! - in org.apache.flume.sink.solr.morphline.
> > TestMorphlineSolrSink
> > [ERROR] org.apache.flume.sink.solr.morphline.TestMorphlineSolrSink
> > Time
> > elapsed: 1.797 s  <<< ERROR!
> > java.util.IllformedLocaleException: Invalid subtag: en_us [at index 0]
> >
> > [INFO] Running
> > org.apache.flume.sink.solr.morphline.TestUUIDInterceptor
> > [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> > 0.391 s - in org.apache.flume.sink.solr.morphline.TestUUIDInterceptor
> > [INFO]
> > [INFO] Results:
> > [INFO]
> > [ERROR] Errors:
> > [ERROR]   TestMorphlineSolrSink>LuceneTestCase.localeForLanguageTag:1588 » IllformedLocale
> > [INFO]
> > [ERROR] Tests run: 17, Failures: 0, Errors: 1, Skipped: 0 [INFO]
> > [INFO]
> > --
> > --
> > [INFO] BUILD FAILURE
> >
> >
> > The LuceneTestCase is extended by SolrTestCaseJ4.
> > No idea, what I could do against this.
> >
> > best regards,
> >
> > Helmut
> >
> > -----Original Message-----
> > From: Mike Percy [mailto:mpe...@apache.org]
> > Sent: Monday, 19 March 2018 05:12
> > To: dev@flume.apache.org
> > Subject: Re: Merge of patch in Flume-3021?
> >
> > Nice! Thanks Ferenc. Answered better than I could have done. :)
> >
> > Sorry, I meant to send this last week but I just found it in my drafts.
> >
> > Helmut, please let us know if you need more help with this.
> >
> > Mike
> >
> > On Tue, Mar 6, 2018 at 7:25 AM, Ferenc Szabo <fsz...@cloudera.com>
> wrote:
> >
> > > the createJetty method of a Test class became final in the new
> > > versions of solr.
> > > the kite sdk test-jar has to be removed because it depends on a
> > > different incompatible version of solr
> > >
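> > > <dependency>
> > >   <groupId>org.kitesdk</groupId>
> > >   <artifactId>kite-morphlines-solr-core</artifactId>
> > >   <version>${kite.version}</version>
> > >   <type>test-jar</type>
> > >   <scope>test</scope>
> > > </dependency>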
> > >
> > > TestEnvironment.java has to be removed as well because it depends on
> > > the incompatible dependency
> > >
> > > then we need this class:
> > > https://github.com/kite-sdk/kite/blob/master/kite-morphlines/kite-morphlines-solr-core/src/test/java/org/kitesdk/morphline/solr/TestEmbeddedSolrServer.java
> > > I believe it is ok to have a copy of this because it is part of the
> > > incompatible test dependency we just removed
> > >
> > > then we need a newer version of commons

Re: Merge of patch in Flume-3021?

2018-03-26 Thread Mike Percy
I haven't figured this out yet but I'll look into it this week.

Mike

On Mon, Mar 19, 2018 at 2:41 AM, Wahrmann, Helmut <helmut.wahrm...@rsa.com>
wrote:

> Hi Mike,
>
> With the help of Ferenc, I got rid of the initial errors.
>
> Only one is remaining now:
>
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.flume.sink.solr.morphline.TestBlobDeserializer
> [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 1.126 s - in org.apache.flume.sink.solr.morphline.TestBlobDeserializer
> [INFO] Running org.apache.flume.sink.solr.morphline.TestBlobHandler
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 1.125 s - in org.apache.flume.sink.solr.morphline.TestBlobHandler
> [INFO] Running org.apache.flume.sink.solr.morphline.
> TestMorphlineInterceptor
> [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 7.724 s - in org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor
> [INFO] Running org.apache.flume.sink.solr.morphline.TestMorphlineSolrSink
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 1.797 s <<< FAILURE! - in org.apache.flume.sink.solr.morphline.
> TestMorphlineSolrSink
> [ERROR] org.apache.flume.sink.solr.morphline.TestMorphlineSolrSink  Time
> elapsed: 1.797 s  <<< ERROR!
> java.util.IllformedLocaleException: Invalid subtag: en_us [at index 0]
>
> [INFO] Running org.apache.flume.sink.solr.morphline.TestUUIDInterceptor
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 0.391 s - in org.apache.flume.sink.solr.morphline.TestUUIDInterceptor
> [INFO]
> [INFO] Results:
> [INFO]
> [ERROR] Errors:
> [ERROR]   TestMorphlineSolrSink>LuceneTestCase.localeForLanguageTag:1588 » IllformedLocale
> [INFO]
> [ERROR] Tests run: 17, Failures: 0, Errors: 1, Skipped: 0 [INFO] [INFO]
> 
> [INFO] BUILD FAILURE
>
>
> The LuceneTestCase is extended by SolrTestCaseJ4.
> I have no idea what I could do about this.
>
> best regards,
>
> Helmut
>
> -----Original Message-----
> From: Mike Percy [mailto:mpe...@apache.org]
> Sent: Monday, 19 March 2018 05:12
> To: dev@flume.apache.org
> Subject: Re: Merge of patch in Flume-3021?
>
> Nice! Thanks Ferenc. Answered better than I could have done. :)
>
> Sorry, I meant to send this last week but I just found it in my drafts.
>
> Helmut, please let us know if you need more help with this.
>
> Mike
>
> On Tue, Mar 6, 2018 at 7:25 AM, Ferenc Szabo <fsz...@cloudera.com> wrote:
>
> > the createJetty method of a Test class became final in the new
> > versions of solr.
> > the kite sdk test-jar has to be removed because it depends on a
> > different incompatible version of solr
> >
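> > <dependency>
> >   <groupId>org.kitesdk</groupId>
> >   <artifactId>kite-morphlines-solr-core</artifactId>
> >   <version>${kite.version}</version>
> >   <type>test-jar</type>
> >   <scope>test</scope>
> > </dependency>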
> >
> > TestEnvironment.java has to be removed as well because it depends on
> > the incompatible dependency
> >
> > then we need this class:
> > https://github.com/kite-sdk/kite/blob/master/kite-morphlines/kite-morphlines-solr-core/src/test/java/org/kitesdk/morphline/solr/TestEmbeddedSolrServer.java
> > I believe it is ok to have a copy of this because it is part of the
> > incompatible test dependency we just removed
> >
> > then we need a newer version of commons-compress:
> > 1.10
> >
> > from here you can fix the actual solr related test errors :)
> >
> >
> >
> > On Tue, Mar 6, 2018 at 11:56 AM, Wahrmann, Helmut
> > <helmut.wahrm...@rsa.com
> > >
> > wrote:
> >
> > > Hi Ferenc,
> > >
> > > Thanks for offering help.
> > > In agreement with Mike I want to have support for Solr 7.2.1 in the
> > > morphline solr sink, so that we can easily upgrade the Elasticsearch
> > sink.
> > >
> > > My updates are here:
> > > https://github.com/hwahrmann/flume/tree/Upgrade_Morphline_Sink
> > >
> > > I changed the solr version to 7.2.1 and was able to compile the sink
> > > without any problems.
> > > I can also compile the tests, but when running, I get multiple
> > > errors
> > like
> > > this:
> > >
> > > [INFO] Running org.apache.flume.sink.solr.morphline.
> > > TestMorphlineInterceptor
> > > [ERROR] Tests run: 
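
Note on the remaining failure reported earlier in this message ("Invalid subtag: en_us [at index 0]"): the message matches what java.util.Locale.Builder reports when it is handed an underscore-separated identifier instead of a BCP 47 language tag, which uses hyphens. Where the "en_us" value comes from in the Solr test setup is not established in this thread. The sketch below only reproduces the parsing behaviour; the class name LocaleTagDemo and the helper toLanguageTag are made up for illustration.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

final class LocaleTagDemo {

    /** Returns true if the tag parses as a well-formed BCP 47 language tag. */
    static boolean isWellFormed(String tag) {
        try {
            new Locale.Builder().setLanguageTag(tag).build();
            return true;
        } catch (IllformedLocaleException e) {
            // Thrown for tags such as "en_us": '_' is not a subtag separator.
            return false;
        }
    }

    /** Hypothetical helper: turns "en_us"-style identifiers into hyphenated tags. */
    static String toLanguageTag(String underscoreId) {
        return underscoreId.replace('_', '-');
    }

    public static void main(String[] args) {
        System.out.println("en_us well-formed: " + isWellFormed("en_us"));
        System.out.println("en-US well-formed: " + isWellFormed("en-US"));
        System.out.println("converted well-formed: " + isWellFormed(toLanguageTag("en_us")));
    }
}
```

If the locale is being passed in from the environment or a build property, supplying it in hyphenated form (for example en-US) may be enough to get past this error, but that is an assumption to verify against the actual test configuration.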

Re: Merge of patch in Flume-3021?

2018-03-18 Thread Mike Percy
Nice! Thanks Ferenc. Answered better than I could have done. :)

Sorry, I meant to send this last week but I just found it in my drafts.

Helmut, please let us know if you need more help with this.

Mike

On Tue, Mar 6, 2018 at 7:25 AM, Ferenc Szabo <fsz...@cloudera.com> wrote:

> the createJetty method of a Test class became final in the new versions of
> solr.
> the kite sdk test-jar has to be removed because it depends on a different
> incompatible version of solr
>
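> <dependency>
>   <groupId>org.kitesdk</groupId>
>   <artifactId>kite-morphlines-solr-core</artifactId>
>   <version>${kite.version}</version>
>   <type>test-jar</type>
>   <scope>test</scope>
> </dependency>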
>
> TestEnvironment.java has to be removed as well because it depends on the
> incompatible dependency
>
> then we need this class:
> https://github.com/kite-sdk/kite/blob/master/kite-morphlines/kite-morphlines-solr-core/src/test/java/org/kitesdk/morphline/solr/TestEmbeddedSolrServer.java
> I believe it is ok to have a copy of this because it is part of the
> incompatible test dependency we just removed
>
> then we need a newer version of commons-compress:
> 1.10
>
> from here you can fix the actual solr related test errors :)
>
>
>
> On Tue, Mar 6, 2018 at 11:56 AM, Wahrmann, Helmut <helmut.wahrm...@rsa.com
> >
> wrote:
>
> > Hi Ferenc,
> >
> > Thanks for offering help.
> > In agreement with Mike I want to have support for Solr 7.2.1 in the
> > morphline solr sink, so that we can easily upgrade the Elasticsearch
> sink.
> >
> > My updates are here:
> > https://github.com/hwahrmann/flume/tree/Upgrade_Morphline_Sink
> >
> > I changed the solr version to 7.2.1 and was able to compile the sink
> > without any problems.
> > I can also compile the tests, but when running, I get multiple errors
> like
> > this:
> >
> > [INFO] Running org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor
> > [ERROR] Tests run: 66, Failures: 0, Errors: 66, Skipped: 0, Time elapsed: 24.438 s <<< FAILURE! - in org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor
> > [ERROR] testIfDetectMimeTypeRouteToNorthPole(org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor)  Time elapsed: 1.985 s  <<< ERROR!
> > java.lang.VerifyError: class org.kitesdk.morphline.solr.AbstractSolrMorphlineZkTest overrides final method createJetty.(Ljava/io/File;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Lorg/apache/solr/client/solrj/embedded/JettySolrRunner;
> > at org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor.build(TestMorphlineInterceptor.java:151)
> > at org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor.testIfDetectMimeTypeRouteToNorthPole(TestMorphlineInterceptor.java:139)
> >
> > [ERROR] testGrokIfNotMatchDropEventRetain(org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor)  Time elapsed: 0.363 s  <<< ERROR!
> > java.lang.VerifyError: class org.kitesdk.morphline.solr.AbstractSolrMorphlineZkTest overrides final method createJetty.(Ljava/io/File;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Lorg/apache/solr/client/solrj/embedded/JettySolrRunner;
> > at org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor.build(TestMorphlineInterceptor.java:151)
> > at org.apache.flume.sink.solr.morphline.TestMorphlineInterceptor.testGrokIfNotMatchDropEventRetain(TestMorphlineInterceptor.java:83)
> >
> >
> > They all have problems with createJetty, so it seems that I may need a
> > different version of Jetty or something like that.
> > And I know too little about Maven for that.
> >
> > thx,
> >
> > Helmut
> >
> > -----Original Message-----
> > From: Ferenc Szabo [mailto:fsz...@cloudera.com]
> > Sent: Tuesday, 6 March 2018 11:04
> > To: dev@flume.apache.org
> > Subject: Re: Merge of patch in Flume-3021?
> >
> > Hi Helmut,
> >
> > let me know what can I help You with.
> > share your current code on a github fork and describe the issue. I will
> > see what can we do to solve it.
> >
> >
> > On Tue, Mar 6, 2018 at 10:50 AM, Wahrmann, Helmut <
> helmut.wahrm...@rsa.com
> > >
> > wrote:
> >
> > > Hi Mike,
> > >
> > > I am stuck with Solr.
> > > The morphline-solr sink compiles without any problems, but I am
> > > struggling with the tests.
> > > Seems I need to exclude some st

Re: Merge of patch in Flume-3021?

2018-02-14 Thread Mike Percy
Hi Helmut,
As long as the integration tests still pass and the packaging issues are
not exacerbated, I don't see why we couldn't merge an upgrade patch,
barring any serious concerns with the patch.

Mike

On Wed, Feb 14, 2018 at 1:47 AM, Wahrmann, Helmut <helmut.wahrm...@rsa.com>
wrote:

> Hi Mike,
>
> I won't have a problem upgrading the Solr sink to the latest version.
> I am missing a test environment, however.
> So while it may build correctly and all integration tests work, I have no
> real environment to test with.
>
> best regards,
> Helmut
>
> -----Original Message-----
> From: Mike Percy [mailto:mpe...@apache.org]
> Sent: Wednesday, 14 February 2018 00:38
> To: dev@flume.apache.org
> Subject: Re: Merge of patch in Flume-3021?
>
> OK. In the pull request, it would be nice if whoever submits or merges it
> mentions all of the contributors to the patch in the commit message.
>
> I asked Wolfgang H. about the SolrServer thing and this is what he told me:
>
> Hi Mike, the class has been renamed to "SolrClient" (which unfortunately
> > breaks compat). It's just a class rename. The functionality is the
> > same as before. It was called SolrServer in Solr4 because it was a
> > client proxy that sends RPCs to a Solr server, but calling it
> > SolrClient is more straightforward to understand, hence the community
> > decided to rename the class.
> > It's possible to spawn an embedded Solr server, for example for
> > testing purposes, via class EmbeddedSolrServer (a class that retains
> > the same name in Solr7 and Solr4), which extends the SolrClient class.
>
>
> Hope this helps,
> Mike
>
> On Tue, Feb 13, 2018 at 4:41 AM, Wahrmann, Helmut <helmut.wahrm...@rsa.com
> >
> wrote:
>
> > Hi Mike,
> >
> > Thanks for the response. Would be cool if we get that sorted out.
> >
> > I've asked Yonghao Zou to submit the Pull Request, since he did most
> > of the work and should get the credit.
> > He'll  do so after the Chinese New Year's Eve.
> >
> > I will then issue a Pull request for the new ES Rest client, which is
> > dependent on the above work.
> >
> > best regards,
> > Helmut
> >
> > -----Original Message-----
> > From: Mike Percy [mailto:mpe...@apache.org]
> > Sent: Tuesday, 13 February 2018 04:30
> > To: dev@flume.apache.org
> > Subject: Re: Merge of patch in Flume-3021?
> >
> > Hi Helmut,
> > I see that I neglected to follow up on the other thread on this topic
> > after your reply about SolrServer missing from the solrj jar. Let me
> > ask around w/ some folks I know that work on Solr and see if there is
> > any way to retain the SolrServer for our tests after upgrading to the
> new version.
> >
> > Thank you very much for working on upgrading Solr. Would you mind
> > submitting a pull request with your (apparently work-in-progress)
> > patch to upgrade both Solr and ES?
> >
> > To reply to your email in this thread, the JAR packaging situation is
> > largely the same after merging FLUME-2957 so unfortunately most of
> > what I noted in my reply in the other thread (
> > https://s.apache.org/GqcX ) still holds.
> >
> > I hope that we can upgrade the Solr dependencies as part of the same
> > commit as the ES dependencies to avoid worrying about which lucene jar
> > is first in the classpath, and ensure we are not adding any additional
> > dependency conflicts to mvn dependency:tree.
> >
> > Regards,
> > Mike
> >
> > On Mon, Feb 12, 2018 at 12:54 AM, Wahrmann, Helmut <
> > helmut.wahrm...@rsa.com>
> > wrote:
> >
> > > Hi,
> > >
> > > now that the blocker for FLUME-3021 is removed by committing
> > > FLUME-2957, can we get the patch from 3021 merged to trunk?
> > >
> > > Thanks,
> > >
> > > Helmut
> > >
> >
>


Re: Merge of patch in Flume-3021?

2018-02-13 Thread Mike Percy
OK. In the pull request, it would be nice if whoever submits or merges it
mentions all of the contributors to the patch in the commit message.

I asked Wolfgang H. about the SolrServer thing and this is what he told me:

Hi Mike, the class has been renamed to "SolrClient" (which unfortunately
> breaks compat). It's just a class rename. The functionality is the same as
> before. It was called SolrServer in Solr4 because it was a client proxy
> that sends RPCs to a Solr server, but calling it SolrClient is more
> straightforward to understand, hence the community decided to rename the
> class.
> It's possible to spawn an embedded Solr server, for example for testing
> purposes, via class EmbeddedSolrServer (a class that retains the same name
> in Solr7 and Solr4), which extends the SolrClient class.


Hope this helps,
Mike

On Tue, Feb 13, 2018 at 4:41 AM, Wahrmann, Helmut <helmut.wahrm...@rsa.com>
wrote:

> Hi Mike,
>
> Thanks for the response. Would be cool if we get that sorted out.
>
> I've asked Yonghao Zou to submit the Pull Request, since he did most of
> the work and should get the credit.
> He'll  do so after the Chinese New Year's Eve.
>
> I will then issue a Pull request for the new ES Rest client, which is
> dependent on the above work.
>
> best regards,
> Helmut
>
> -----Original Message-----
> From: Mike Percy [mailto:mpe...@apache.org]
> Sent: Tuesday, 13 February 2018 04:30
> To: dev@flume.apache.org
> Subject: Re: Merge of patch in Flume-3021?
>
> Hi Helmut,
> I see that I neglected to follow up on the other thread on this topic
> after your reply about SolrServer missing from the solrj jar. Let me ask
> around w/ some folks I know that work on Solr and see if there is any way
> to retain the SolrServer for our tests after upgrading to the new version.
>
> Thank you very much for working on upgrading Solr. Would you mind
> submitting a pull request with your (apparently work-in-progress) patch to
> upgrade both Solr and ES?
>
> To reply to your email in this thread, the JAR packaging situation is
> largely the same after merging FLUME-2957 so unfortunately most of what I
> noted in my reply in the other thread ( https://s.apache.org/GqcX ) still
> holds.
>
> I hope that we can upgrade the Solr dependencies as part of the same
> commit as the ES dependencies to avoid worrying about which lucene jar is
> first in the classpath, and ensure we are not adding any additional
> dependency conflicts to mvn dependency:tree.
>
> Regards,
> Mike
>
> On Mon, Feb 12, 2018 at 12:54 AM, Wahrmann, Helmut <
> helmut.wahrm...@rsa.com>
> wrote:
>
> > Hi,
> >
> > now that the blocker for FLUME-3021 is removed by committing
> > FLUME-2957, can we get the patch from 3021 merged to trunk?
> >
> > Thanks,
> >
> > Helmut
> >
>


Re: Store password for config safely?

2018-02-13 Thread Mike Percy
I think Ferenc has been looking at something related to this, or perhaps is
trying to get an existing patch merged (FLUME-2442, PR 197). I haven't been
following that work closely so I don't know if it's exactly what you're
looking for, but maybe he can chime in here.

Mike

On Mon, Feb 12, 2018 at 1:16 AM, Wahrmann, Helmut 
wrote:

> Hi,
>
> Do we have a way of storing a password safely, i.e. not in clear text?
> When e.g. an Elasticsearch cluster is protected by X-Pack Security, I need
> to specify a userid / password when connecting.
> The userid / password could be specified in the config, but then the
> password would be available in readable form.
>
> Do we have other sinks or sources, where we are dealing with passwords and
> were a suitable method exists?
>
> best regards,
>
> Helmut
>


Re: Merge of patch in Flume-3021?

2018-02-12 Thread Mike Percy
Hi Helmut,
I see that I neglected to follow up on the other thread on this topic after
your reply about SolrServer missing from the solrj jar. Let me ask around
w/ some folks I know that work on Solr and see if there is any way to
retain the SolrServer for our tests after upgrading to the new version.

Thank you very much for working on upgrading Solr. Would you mind
submitting a pull request with your (apparently work-in-progress) patch to
upgrade both Solr and ES?

To reply to your email in this thread, the JAR packaging situation is
largely the same after merging FLUME-2957 so unfortunately most of what I
noted in my reply in the other thread ( https://s.apache.org/GqcX ) still
holds.

I hope that we can upgrade the Solr dependencies as part of the same commit
as the ES dependencies to avoid worrying about which lucene jar is first in
the classpath, and ensure we are not adding any additional dependency
conflicts to mvn dependency:tree.

Regards,
Mike

On Mon, Feb 12, 2018 at 12:54 AM, Wahrmann, Helmut 
wrote:

> Hi,
>
> now that the blocker for FLUME-3021 is removed by committing FLUME-2957,
> can we get the patch from 3021 merged to trunk?
>
> Thanks,
>
> Helmut
>


Re: Breaking changes in Flume - 2.0 release

2018-02-05 Thread Mike Percy
Based on the above, it seems that we have consensus to move forward with an
ABI-breaking change and a Flume 2.0 release.

I'm going to branch from the current trunk as flume-1.x and we can continue
committing to trunk assuming that the next release from trunk will be the
2.0.0 release.

I'm also going to +1 Denes' change @ https://github.com/apache/flume/pull/195

Mike

On Tue, Jan 30, 2018 at 2:32 PM, Mike Percy <mpe...@apache.org> wrote:

> On Tue, Jan 30, 2018 at 11:20 AM, Hari Shreedharan <
> hshreedha...@apache.org> wrote:
>
>> Agreed. Another change we might want to consider is shading the
>> dependencies of individual modules and the framework itself. This will
>> make
>> it easier to upgrade individual modules.
>>
>
> +1, I don't know to what extent this is possible but if so then I'm all
> for it.
>
> Mike
>
>


Re: Breaking changes in Flume - 2.0 release

2018-01-30 Thread Mike Percy
On Tue, Jan 30, 2018 at 11:20 AM, Hari Shreedharan 
wrote:

> Agreed. Another change we might want to consider is shading the
> dependencies of individual modules and the framework itself. This will make
> it easier to upgrade individual modules.
>

+1, I don't know to what extent this is possible but if so then I'm all for
it.

Mike


Re: Merging to trunk?

2018-01-29 Thread Mike Percy
On Mon, Jan 29, 2018 at 10:48 AM, Wahrmann, Helmut  wrote:
>
> I think it is not a problem to distribute the modules together if we have
> maintainers for them.
> It doesn't make sense to release Flume 1.9 in 2018 with support for module
> versions which were current in 2013.


> Best way would be to identify people willing to upgrade the modules to
> their latest version.
> If we can't find someone, we should distribute it separately.


I would be OK with upgrading Solr and ES simultaneously if we can do it in
a compatible way. However, there is no guarantee that the latest versions
of Solr and ES will be dependency-compatible. If they are, I would
certainly be in favor of merging such a patch.

On Mon, Jan 29, 2018 at 2:49 AM, Ferenc Szabo  wrote:

> I believe, that a clean and maintainable solution would be if the
> engine/framework itself could be separated from everything else,
> the dependency directions would be fixed, plugins would depend only on
> api-s  and after that, any source/sink/etc implementation would come as an
> individual module/plugin and would be loaded with an isolated classloader.
>

I agree that isolating Flume plugins from each other would solve this
problem once and for all. I wonder if we can do both things: come up with a
short-term "band-aid" patch to allow us to upgrade ES / Solr and also come
up with a longer-term plan to solve the underlying problem.

Mike
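
A minimal sketch of the "isolated classloader per plugin" idea discussed above, assuming each plugin ships its jars in its own directory. The class PluginClassLoaders, the method forPluginDir, and the directory names in main are hypothetical and are not part of Flume; the only real API used is java.net.URLClassLoader.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

final class PluginClassLoaders {

    /**
     * Builds a class loader that sees the jars in pluginDir plus whatever
     * apiLoader exposes. Two plugins loaded this way do not see each
     * other's jars, so they can carry conflicting library versions.
     */
    static URLClassLoader forPluginDir(File pluginDir, ClassLoader apiLoader) {
        List<URL> urls = new ArrayList<>();
        File[] jars = pluginDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars != null) { // null when the directory does not exist
            for (File jar : jars) {
                try {
                    urls.add(jar.toURI().toURL());
                } catch (MalformedURLException e) {
                    throw new IllegalArgumentException("Bad jar path: " + jar, e);
                }
            }
        }
        return new URLClassLoader(urls.toArray(new URL[0]), apiLoader);
    }

    public static void main(String[] args) {
        ClassLoader api = ClassLoader.getSystemClassLoader();
        // Illustrative paths only.
        URLClassLoader solr = forPluginDir(new File("plugins.d/solr-sink/lib"), api);
        URLClassLoader es = forPluginDir(new File("plugins.d/elasticsearch-sink/lib"), api);
        System.out.println("solr jars: " + solr.getURLs().length);
        System.out.println("es jars: " + es.getURLs().length);
    }
}
```

URLClassLoader delegates to its parent first, so this only isolates libraries that are absent from the parent loader. That is why the proposal above also requires that plugins depend only on the APIs: anything like Lucene or Guava left on the framework classpath would still win over the plugin's own copy.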


Re: Breaking changes in Flume - 2.0 release

2018-01-29 Thread Mike Percy
I think this change is important enough to warrant a breaking version bump.
Virtually every project depends on Guava, and Flume should certainly
shade/rename Guava for its internal use.

I'd be on board with doing a Flume 2.0 to remove Guava from the public API.

Mike

On Mon, Jan 29, 2018 at 2:31 AM, Denes Arvay  wrote:

> Hi Flume Community,
>
> It has come up a couple of times that we should fully remove (or at least
> shade) Guava from Flume, but unfortunately it's part of our public API
> [1,2,3].
> As a first step I worked on FLUME-2957 [4] and created a pull request [5].
> Mike has already reviewed it, thanks for that. As he pointed out it is a
> breaking change thus it can be included in Flume 2.0 release only.
>
> Flume 2.0 will be a good opportunity to introduce other breaking changes as
> well, so we might want to start collecting the features, improvements,
> other possible changes.
>
> What do you think?
>
> Best,
> Denes
>
> [1]
> https://github.com/apache/flume/blob/trunk/flume-ng-configuration/src/main/java/org/apache/flume/Context.java#L51
> [2]
> https://github.com/apache/flume/blob/trunk/flume-ng-node/src/main/java/org/apache/flume/node/MaterializedConfiguration.java#L41
> [3]
> https://github.com/apache/flume/blob/trunk/flume-ng-node/src/main/java/org/apache/flume/node/SimpleMaterializedConfiguration.java#L64
> [4] https://issues.apache.org/jira/browse/FLUME-2957
> [5] https://github.com/apache/flume/pull/195
>


Re: Merging to trunk?

2018-01-29 Thread Mike Percy
Interesting proposal Yonghao.

I think it's worth pointing out that today, a Flume build will work "out of
the box" with any combination of supported plugins since they are all
already in the Flume classpath. The downside of that, of course, is that
the dependency hell makes it very hard to upgrade modules. The benefit is
that it's nice from a usability perspective.

However, if we are willing to drop this "out of the box" usage feature,
then we could allow for separate modules to have conflicting dependencies
(i.e. Solr and ElasticSearch) as long as they are not loaded in the same
Flume agent.

What do you folks think about removing the Solr Sink and the ElasticSearch
Sink from the default distribution and instead distribute them as separate
(fat) jars? If we do it for those two then it might also make sense to do
it for other modules as well.

I am not sure this is very user friendly in the common case where people
just want to use the HDFS and HBase sink out of the box. So maybe we leave
those as part of the "default" module setup?

Obviously, changing how modules get included into the classpath would be a
breaking change. I think a change that drastic would call for a Flume 2.0.0
release in order to indicate that it's not backwards compatible.

Thoughts?

Mike

On Sat, Jan 27, 2018 at 10:50 PM, Yonghao Zou <yonghaoz1...@gmail.com>
wrote:

> Maybe the Maven shade plugin can solve this, we can package each module to
> a fat jar and they will not depend on other modules.
>
> 2018-01-27 6:14 GMT+08:00 Wahrmann, Helmut <helmut.wahrm...@rsa.com>:
>
> >  fully agreed.
> > But we shouldn't do that to support a 4-year-old component.
> > The goal should be to support the latest releases.
> > It is ridiculous to release flume 1.9 with elasticsearch 0.90.1 support,
> > when 6.x is the current version.
> >
> >
> >
> > Best regards,
> >
> > Helmut Wahrmann
> >
> >
> > -------- Original message --------
> > From: Ralph Goers <ralph.go...@dslextreme.com>
> > Date: 26.01.18 22:12 (GMT+01:00)
> > To: dev@flume.apache.org
> > Subject: Re: Merging to trunk?
> >
> > The “right” way to deal with this from a Maven perspective is to declare
> > the version you want in the dependencyManagement section of the parent
> pom
> > and to not specify any versions in child poms. Then use the Maven
> enforcer
> > plugin to make sure everything required has a declaration in the
> > dependencyManagement section.
> >
> > Ralph
> >
> > > On Jan 26, 2018, at 8:28 AM, Wahrmann, Helmut <helmut.wahrm...@rsa.com
> >
> > wrote:
> > >
> > > Hi Mike,
> > >
> > > This only shows that some other component needs to be updated as well.
> > > Which software relies on Lucene 4.3.0, which was released in 2013?
> > > It is most probably not used by anyone else anymore.
> > >
> > > This needs to be solved. It makes no sense to release Flume 1.9 with
> > support for ElasticSearch 0.90.1.
> > > And it doesn't make sense either to release Flume 1.9 with support for
> a
> > component that relies on Lucene 4.3.0.
> > >
> > > We will always have dependency problems, if one component is updated to
> > a newer version and others not.
> > >
> > > The way to get rid of the conflict is probably to set "optional" to
> > > true in the pom.xml.
> > > Then we don't distribute those jars in the Flume lib directory, and users
> > > need to set their classpath to point to, e.g., Elasticsearch, so that the
> > > correct libs will be used.
> > > I doubt that an Elasticsearch user will also use some other component
> > > that depends on, e.g., this outdated Lucene.
> > > best regards,
> > >
> > > Helmut
> > >
> > >
> > > -----Original Message-----
> > > From: Mike Percy [mailto:mpe...@apache.org]
> > > Sent: Thursday, 25 January 2018 23:01
> > > To: dev@flume.apache.org
> > > Subject: Re: Merging to trunk?
> > >
> > > Hi Helmut,
> > > I wrote a small Perl script to identify dependency conflicts based on
> > > output from mvn dependency:tree and posted it here:
> > > https://gist.github.com/mpercy/39614d770864bdd0c386befd5e8a1840
> > >
> > > I ran that on the current trunk and it actually found some errors which
> > should be fixed:
> > >
> > > Version conflict: package com.google.guava:guava:jar needed in 2
> > different
> > > versions: (11.0.2, 18.0)
> > > Version conflict: package commons-httpclient:commons-httpclient:jar
> > need
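
For readers who want to try the same kind of check, here is a rough Java sketch of the idea described above: scan mvn dependency:tree output and report artifacts that show up with more than one version. This is not the Perl script from the gist, whose contents are not reproduced in this thread. The class name DependencyConflicts and the sample input lines in main are made up for illustration; only the report format follows the output quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

final class DependencyConflicts {

    /** Returns one message per artifact that appears with more than one version. */
    static List<String> findConflicts(List<String> treeLines) {
        // "group:artifact:type" -> versions seen (sorted as plain strings)
        Map<String, SortedSet<String>> versions = new TreeMap<>();
        for (String line : treeLines) {
            for (String token : line.trim().split("\\s+")) {
                String[] p = token.split(":");
                // Expected shapes: group:artifact:type:version[:scope]
                // or group:artifact:type:classifier:version:scope
                if (p.length < 4 || p.length > 6) {
                    continue;
                }
                String version = (p.length == 6) ? p[4] : p[3];
                String key = p[0] + ":" + p[1] + ":" + p[2];
                versions.computeIfAbsent(key, k -> new TreeSet<>()).add(version);
                break; // at most one coordinate per line
            }
        }
        List<String> conflicts = new ArrayList<>();
        for (Map.Entry<String, SortedSet<String>> e : versions.entrySet()) {
            if (e.getValue().size() > 1) {
                conflicts.add("Version conflict: package " + e.getKey()
                        + " needed in " + e.getValue().size()
                        + " different versions: (" + String.join(", ", e.getValue()) + ")");
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the usual dependency:tree layout.
        List<String> sample = Arrays.asList(
                "[INFO] +- com.google.guava:guava:jar:11.0.2:compile",
                "[INFO] |  \\- commons-io:commons-io:jar:2.1:compile",
                "[INFO] +- com.google.guava:guava:jar:18.0:compile");
        findConflicts(sample).forEach(System.out::println);
    }
}
```

In a multi-module build the conflicting versions typically come from different modules, so the tree output of the whole reactor would need to be fed in, not that of a single module.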

Re: Squash commits on trunk

2018-01-28 Thread Mike Percy
+1 from me. Thanks for the cleanup, Denes!

Mike

On Fri, Jan 26, 2018 at 1:17 PM, Ralph Goers <ralph.go...@dslextreme.com>
wrote:

> This looks correct to me.
>
> Ralph
>
> > On Jan 26, 2018, at 8:45 AM, Denes Arvay <de...@cloudera.com> wrote:
> >
> > Hi Flume Community,
> >
> > I have squashed the previously mentioned commits on my fork. I'd be happy
> > if you could have a look at it:
> > https://github.com/adenes/flume/commits/squashed-log4j-upgrade
> >
> > I have compared the source files with the current trunk (commit:
> ffc5554),
> > found no difference.
> > I also compiled trunk and my branch and compared the class files, the
> only
> > difference was the
> > auto-generated ./flume-ng-core/target/classes/org/apache/flume/
> package-info.class
> > file, which contains the branch name, commit hash, etc.
> >
> > This is the new commit
> > https://github.com/adenes/flume/commit/69c66efefdcd74904986f2727bdf0d52dd9a75e5
> > which was created by squashing the following commits:
> >
> > fbc7a68 Merge branch 'trunk' into flume-2050
> > 6813d9c Upgrade to Log4j 2.10.0
> > e4fd6ab Remove more references to log4j 1
> > 6b6605c Update configuration to match log4j 1.x
> > 4bb5e88 FLUME-2050 - modify pattern layout so NDC is ignored if it has no
> > data
> > 4a07fbf FLUME-2050 remove spurious files
> > 140ea5d FLUME-2050 Upgrade to Log4j 2
> >
> > If there are no objections I'll force push this to the trunk.
> > (Note: it might mess up the git-wip-us.apache.org -> github repo
> mirroring,
> > if that happens I'll get in touch with Apache Infra to sort it out)
> >
> > Regards,
> > Denes
> >
> >
> > On Wed, Jan 17, 2018 at 12:00 AM Mike Percy <mpe...@apache.org> wrote:
> >
> >> I agree squash-before-push is a good policy to maintain a readable
> commit
> >> history.
> >>
> >> I'd be +1 to doc this and squash the relevant commits.
> >>
> >> Mike
> >>
> >> On Wed, Jan 10, 2018 at 5:37 AM, Denes Arvay <de...@cloudera.com>
> wrote:
> >>
> >>> Hi Hari,
> >>>
> >>> Thank you for your answer.
> >>> I think having one single commit with a structured commit message
> >> belonging
> >>> to one Jira ticket has several benefits:
> >>> - it makes it easier to cherry-pick/backport fixes to release branches
> >>> - simplifies the commit history and avoids having different ways for
> >>> different committers to merge the changes
> >>> - makes it possible to give credit to the authors and reviewers
> >>>
> >>> So I suggest keeping the squash-before-pushing policy, but I'm open to
> >>> more input and recommendations as well.
> >>>
> >>> Best,
> >>> Denes
> >>>
> >>> On Tue, Jan 9, 2018 at 10:55 PM Hari Shreedharan <
> >> hshreedha...@apache.org>
> >>> wrote:
> >>>
> >>>> I don't have any objections to that, but I have to wonder if it makes
> >>> sense
> >>>> to update the guidelines to actually not have to squash commits. I
> >> think
> >>>> the reason we needed to squash those commits was that we were
> >> originally
> >>> on
> >>>> SVN and having multiple commits didn't make much sense in SVN. It is
> >> easy
> >>>> to track history with a single commit, but that looks to be the case
> >>> anyway
> >>>> (I just see 1 merge commit, which is fine - it is an artifact of pull
> >>>> request merges).
> >>>>
> >>>> That said, I don't have an objection to force-pushing, we just need to
> >>> make
> >>>> sure no history is lost.
> >>>>
> >>>> On Tue, Jan 9, 2018 at 1:03 AM, Denes Arvay <de...@cloudera.com>
> >> wrote:
> >>>>
> >>>>> Hi Flume Community,
> >>>>>
> >>>>> A couple of commits went in to trunk recently which weren't in line
> >>> with
> >>>>> our commit guidelines.
> >>>>> I suggest squashing these commits into one and doing a force push to
> >> resolve
> >>>>> this issue, plus - as the guidelines are not clear enough - I'd like
> >> to
> >>>>> extend the
> >>>>> https://github.com/apache/flume/blob/trunk/dev-docs/HowToCommit.md
> >> doc
> >>>> to
> >>>>> be more concrete on the requirements for a commit. These rules are
> >>>>> currently mostly unwritten, so it'd be useful to clarify them.
> >>>>>
> >>>>> I'm happy to do these if there is no objection from the community.
> >>>>>
> >>>>> Regards,
> >>>>> Denes
> >>>>>
> >>>>
> >>>
> >>
>
>
>


Re: Merging to trunk?

2018-01-25 Thread Mike Percy
Hi Helmut,
I wrote a small Perl script to identify dependency conflicts based on
output from mvn dependency:tree and posted it here: https://gist.github.com/
mpercy/39614d770864bdd0c386befd5e8a1840

I ran that on the current trunk and it actually found some errors which
should be fixed:

Version conflict: package com.google.guava:guava:jar needed in 2 different
versions: (11.0.2, 18.0)
Version conflict: package commons-httpclient:commons-httpclient:jar needed
in 2 different versions: (3.0.1, 3.1)
Version conflict: package commons-logging:commons-logging:jar needed in 2
different versions: (1.1.3, 1.2)
Version conflict: package commons-pool:commons-pool:jar needed in 2
different versions: (1.5.4, 1.6)
Version conflict: package net.sf.jopt-simple:jopt-simple:jar needed in 2
different versions: (3.2, 4.7)
Version conflict: package org.codehaus.jackson:jackson-jaxrs:jar needed in
2 different versions: (1.8.3, 1.8.8)
ERROR: 6 package version conflicts identified

However, after applying your patch on top of trunk and running it again, it
identified new conflicts:

Version conflict: package com.google.guava:guava:jar needed in 2 different
versions: (11.0.2, 18.0)
Version conflict: package commons-httpclient:commons-httpclient:jar needed
in 2 different versions: (3.0.1, 3.1)
Version conflict: package commons-logging:commons-logging:jar needed in 2
different versions: (1.1.3, 1.2)
Version conflict: package commons-pool:commons-pool:jar needed in 2
different versions: (1.5.4, 1.6)
Version conflict: package net.sf.jopt-simple:jopt-simple:jar needed in 3
different versions: (3.2, 4.7, 5.0.2)
Version conflict: package org.apache.lucene:lucene-analyzers-common:jar
needed in 2 different versions: (4.3.0, 7.1.0)
Version conflict: package org.apache.lucene:lucene-core:jar needed in 2
different versions: (4.3.0, 7.1.0)
Version conflict: package org.apache.lucene:lucene-grouping:jar needed in 2
different versions: (4.3.0, 7.1.0)
Version conflict: package org.apache.lucene:lucene-highlighter:jar needed
in 2 different versions: (4.3.0, 7.1.0)
Version conflict: package org.apache.lucene:lucene-memory:jar needed in 2
different versions: (4.3.0, 7.1.0)
Version conflict: package org.apache.lucene:lucene-misc:jar needed in 2
different versions: (4.3.0, 7.1.0)
Version conflict: package org.apache.lucene:lucene-queries:jar needed in 2
different versions: (4.3.0, 7.1.0)
Version conflict: package org.apache.lucene:lucene-queryparser:jar needed
in 2 different versions: (4.3.0, 7.1.0)
Version conflict: package org.apache.lucene:lucene-spatial:jar needed in 2
different versions: (4.3.0, 7.1.0)
Version conflict: package org.apache.lucene:lucene-suggest:jar needed in 2
different versions: (4.3.0, 7.1.0)
Version conflict: package org.codehaus.jackson:jackson-jaxrs:jar needed in
2 different versions: (1.8.3, 1.8.8)
Version conflict: package org.yaml:snakeyaml:jar needed in 2 different
versions: (1.10, 1.17)
ERROR: 17 package version conflicts identified

So while it appears Guava is no longer a concern, the patch does introduce
other conflicting dependencies (jopt-simple, Lucene, and SnakeYAML).
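
For readers without the gist at hand, the idea behind this kind of check can
be sketched in Python. This is only an illustration and not the Perl script
linked above: the function names find_version_conflicts and format_conflicts
and the parsing regex are assumptions, and the sketch expects the plain text
output of mvn dependency:tree, where each coordinate has the form
groupId:artifactId:type:version[:scope].

```python
import re
from collections import defaultdict

# Optional "[INFO] " prefix, then tree-drawing characters ("|", "+-", "\-"),
# then a Maven coordinate with 4 to 6 colon-separated fields.
LINE_RE = re.compile(
    r"^(?:\[INFO\]\s)?[|+\\\-\s]*([\w.\-]+(?::[\w.\-]+){3,5})\s*$"
)


def find_version_conflicts(tree_output):
    """Return {"group:artifact:type": [versions]} for multi-version packages."""
    versions = defaultdict(set)
    for line in tree_output.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # banner, summary, or blank line
        parts = match.group(1).split(":")
        if len(parts) == 4:    # root module: group:artifact:type:version
            group, artifact, ptype, version = parts
        elif len(parts) == 5:  # group:artifact:type:version:scope
            group, artifact, ptype, version, _scope = parts
        else:                  # classifier form; the classifier is ignored
            group, artifact, ptype, _classifier, version, _scope = parts
        versions[f"{group}:{artifact}:{ptype}"].add(version)
    # Versions are sorted as plain strings, which is enough for display.
    return {pkg: sorted(v) for pkg, v in versions.items() if len(v) > 1}


def format_conflicts(conflicts):
    """Render conflicts in a message format like the one quoted above."""
    lines = [
        f"Version conflict: package {pkg} needed in {len(found)} "
        f"different versions: ({', '.join(found)})"
        for pkg, found in sorted(conflicts.items())
    ]
    if lines:
        lines.append(
            f"ERROR: {len(conflicts)} package version conflicts identified"
        )
    return lines
```

One way to use it would be to save the output of mvn dependency:tree from the
top-level directory to a file, pass the file contents to
find_version_conflicts, and print the lines returned by format_conflicts.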

I'm interested to hear your thoughts on this.

Regards,
Mike

On Wed, Jan 24, 2018 at 7:57 AM, Wahrmann, Helmut <helmut.wahrm...@rsa.com>
wrote:

> Hi Mike,
>
> Thanks for the quick response.
>
> I fully agree that we need to take care about interop between the
> different components and dependencies.
> But if you look at the patch in FLUME-3021, it only puts in Elasticsearch
> dependencies.
>
> The issue with Guava, which you saw in FLUME-2921 is no longer there,
> because since mid-2016 a lot has changed in the Flume trunk and all those
> dependencies are fixed.
> It won't even be possible to apply this patch anymore.
>
> The patch in FLUME-3021 works with the latest Elasticsearch versions,
> without introducing additional dependencies, which might cause problems to
> other projects.
> So I think it can be safely merged into trunk.
>
> best regards,
>
> Helmut
>
>
> -Original Message-
> From: Mike Percy [mailto:mpe...@apache.org]
> Sent: Tuesday, January 23, 2018 20:39
> To: dev@flume.apache.org
> Subject: Re: Merging to trunk?
>
> Hi Helmut,
> Thank you for bringing this up on dev@ and thank you for the patch. I see
> there are other people interested in this component upgrade as well.
>
> As you are probably aware, a Flume committer will need to approve the
> change before it gets merged to trunk.
>
> My primary concern w/ merging this would be compatibility of the
> dependencies. Flume suffers from a kind of a "dependency hell" because of a
> lack of classloading support (either via OSGI or Java modules). See
> https://issues.apache.org/jira/browse/FLUME-2293 for a tracking ticket
> about that issue. What that means is that all of the components that Flume
> ships must have compatible dependencies with eac

Re: Merging to trunk?

2018-01-23 Thread Mike Percy
Hi Helmut,
Thank you for bringing this up on dev@ and thank you for the patch. I see
there are other people interested in this component upgrade as well.

As you are probably aware, a Flume committer will need to approve the
change before it gets merged to trunk.

My primary concern w/ merging this would be compatibility of the
dependencies. Flume suffers from a kind of a "dependency hell" because of a
lack of classloading support (either via OSGI or Java modules). See
https://issues.apache.org/jira/browse/FLUME-2293 for a tracking ticket
about that issue. What that means is that all of the components that Flume
ships must have compatible dependencies with each other which makes changes
like this more complex. Therefore I would like someone to verify that mvn
dependency:tree does not show conflicts when run from the top level with
the new patches. If memory serves, I believe Google Guava is likely to
conflict.

Also, I think that https://issues.apache.org/jira/browse/FLUME-3021 is a
duplicate of https://issues.apache.org/jira/browse/FLUME-2921 which has
additional information about the various issues that we need to solve to do
this upgrade.

I'd be happy to discuss this issue some more on this thread.

Mike


On Tue, Jan 23, 2018 at 12:57 AM, Wahrmann, Helmut 
wrote:

> Hi,
>
> Who decides if/when a patch or pull request gets merged to trunk?
> Reason I am asking is for the Elasticsearch support. The current code in
> trunk does not work with ES > 2.x.
> Currently Elasticsearch is at 6.1.
>
> In FLUME-3021 we have several patches since March last year.
> I have a patched Flume running at a customer since April last year without
> any problems.
>
> So why not merge those changes into the trunk?
> As I stated in my comment in FLUME-3021, it will not cause any problems,
> because the current trunk won't work with newer ES versions anyhow.
> I doubt that someone is out there still running ES 0.95.
>
> best regards,
>
> Helmut
>


Re: [ANNOUNCE] New Flume PMC Chair

2018-01-18 Thread Mike Percy
Thanks Hari for your kind note! Thanks very much also for all your
consistency and hard work as Flume PMC chair over the last 2+ years.

I'm excited to take on this new role. I hope that folks will reach out with
ideas and suggestions they may have for the project, code wise or community
wise.

Best regards,
Mike

On Thu, Jan 18, 2018 at 12:37 PM, Hari Shreedharan <hshreedha...@apache.org>
wrote:

> Hi all,
>
> It gives me immense happiness to announce that the Apache Software
> Foundation Board has appointed Mike Percy as the new PMC chair of the
> Apache Flume Project. Mike has contributed immensely to the project, and is
> one of the most active contributors to the project.
>
> I am confident that Mike will do an amazing job as the chair of the PMC.
> Please join me in congratulating Mike and welcoming him to this new role!
>
> Thanks,
> Hari
>


Re: Working with Jira

2018-01-18 Thread Mike Percy
Hi Helmut,
I think I just added you as a contributor in Flume which would allow you to
assign yourself a JIRA. If it didn't work, send me your username in JIRA.

Sending a GitHub pull request is the preferred method to contribute a patch
these days.

Best,
Mike

On Thu, Jan 18, 2018 at 5:55 AM, Wahrmann, Helmut 
wrote:

> Hi,
>
> In your development guidelines you state that if I would like to work on
> an issue I could assign it to myself:
>
> https://github.com/apache/flume/blob/trunk/CONTRIBUTING.md
>
> Well, I was able to create a Jira, but I am not able to assign it to
> myself. Seems I am missing authorization.
>
> Also, what do you consider the better way?
>
> - Submitting a patch, which is attached to the Jira, or
> - Submitting a pull request and updating the Jira to say that a pull request
>   has been submitted.
>
> Thanks,
> Helmut
>
>


Re: Squash commits on trunk

2018-01-16 Thread Mike Percy
I agree squash-before-push is a good policy to maintain a readable commit
history.

I'd be +1 to doc this and squash the relevant commits.

Mike

On Wed, Jan 10, 2018 at 5:37 AM, Denes Arvay  wrote:

> Hi Hari,
>
> Thank you for your answer.
> I think having one single commit with a structured commit message belonging
> to one Jira ticket has several benefits:
> - it makes it easier to cherry-pick/backport fixes to release branches
> - simplifies the commit history and avoids having different ways for
> different committers to merge the changes
> - makes it possible to give credit to the authors and reviewers
>
> So I suggest keeping the squash-before-pushing policy, but I'm open to more
> input and recommendations as well.
>
> Best,
> Denes
>
> On Tue, Jan 9, 2018 at 10:55 PM Hari Shreedharan 
> wrote:
>
> > I don't have any objections to that, but I have to wonder if it makes
> sense
> > to update the guidelines to actually not have to squash commits. I think
> > the reason we needed to squash those commits was that we were originally
> on
> > SVN and having multiple commits didn't make much sense in SVN. It is easy
> > to track history with a single commit, but that looks to be the case
> anyway
> > (I just see 1 merge commit, which is fine - it is an artifact of pull
> > request merges).
> >
> > That said, I don't have an objection to force-pushing, we just need to
> make
> > sure no history is lost.
> >
> > On Tue, Jan 9, 2018 at 1:03 AM, Denes Arvay  wrote:
> >
> > > Hi Flume Community,
> > >
> > > A couple of commits went in to trunk recently which weren't in line
> with
> > > our commit guidelines.
> > > I suggest squashing these commits into one and doing a force push to resolve
> > > this issue, plus - as the guidelines are not clear enough - I'd like to
> > > extend the
> > > https://github.com/apache/flume/blob/trunk/dev-docs/HowToCommit.md doc
> > to
> > > be more concrete on the requirements for a commit. These rules are
> > > currently mostly unwritten, so it'd be useful to clarify them.
> > >
> > > I'm happy to do these if there is no objection from the community.
> > >
> > > Regards,
> > > Denes
> > >
> >
>


Re: Message Lists

2017-12-19 Thread Mike Percy
Hi Ralph,
My opinion is that we should move JIRA traffic to issues@flume.a.o and keep
review traffic (Reviewboard, GitHub PRs) on the dev list for now, because
the code review requests are relatively low volume and they are certainly
relevant to Flume developers - more so than much of the issue traffic.
FWIW, the commits already go to commits@.

We could try that out for a few months, and if automated traffic is still
interfering with our ability to have email conversations between human
individuals then we could also split that out into reviews@flume.a.o.
That's what we do in Apache Kudu but it seems like overkill for a project
with the volume of pull requests that Flume has right now.

My 2 cents,
Mike

On Tue, Dec 19, 2017 at 7:48 AM, Ralph Goers <ralph.go...@dslextreme.com>
wrote:

> I’ll give this another day and if there are no objections I will go ahead
> and create Infra issues to do this.
>
> Ralph
>
> > On Dec 12, 2017, at 9:46 PM, Ralph Goers <ralph.go...@dslextreme.com>
> wrote:
> >
> > I am proposing that we create a new list for automated messages. I am
> forwarding it here since it has only gotten replies from Hari and Mike on
> the dev list. The discussion really belongs on the dev list but you may not
> be seeing it due to the noise.
> >
> > Ralph
> >
> >> Begin forwarded message:
> >>
> >> From: Hari Shreedharan <hshreedha...@apache.org>
> >> Subject: Re: Message Lists
> >> Date: December 11, 2017 at 1:41:11 PM MST
> >> To: dev@flume.apache.org
> >> Reply-To: dev@flume.apache.org
> >>
> >> +1. I agree, we should move the private messages out.
> >>
> >> On Sun, Dec 10, 2017 at 12:14 PM, Mike Percy <mpe...@apache.org> wrote:
> >>
> >>> If necessary due to noise we can take this discussion back to private@
> >>> for a check-in.
> >>>
> >>> Mike
> >>>
> >>> Sent from my iPhone
> >>>
> >>>> On Dec 10, 2017, at 9:15 AM, Ralph Goers <ralph.go...@dslextreme.com>
> >>> wrote:
> >>>>
> >>>> Thanks Mike. Yours is the only feedback in a month. I am uncomfortable
> >>> contacting infra to make the changes based on so little input.
> >>>>
> >>>> Ralph
> >>>>
> >>>>> On Dec 8, 2017, at 9:03 PM, Mike Percy <mpe...@apache.org> wrote:
> >>>>>
> >>>>> Sorry, I didn't see this message because of all the automated emails!
> >>>>>
> >>>>> +1 from me.
> >>>>>
> >>>>> Mike
> >>>>>
> >>>>> On Sat, Nov 11, 2017 at 9:43 PM, Ralph Goers <
> >>> ralph.go...@dslextreme.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Currently all the messages from Jenkins, Jira and GitHub land in
> this
> >>>>>> mailing list. That makes this mailing list very cluttered and it is
> >>> easy to
> >>>>>> miss discussions. Other projects use a “notifications” list to
> accept
> >>>>>> emails from those sources so the dev list can be left for
> >>> person-to-person
> >>>>>> discussions.  I would like to propose that Flume switch to this
> model.
> >>>>>> Another alternative would be to have separate lists for each of
> those
> >>>>>> sources. My personal viewpoint is simply separating the automated
> >>> emails is
> >>>>>> enough but I’d be willing to go along with any plan that moves the
> >>>>>> automated emails to another list.  I also think that doing this
> might
> >>>>>> increase the number of subscribers on the Flume dev list as lots of
> >>> people
> >>>>>> don’t like to deal with all the extra email.
> >>>>>>
> >>>>>> All that said, it would be expected that all committers would
> >>> subscribe to
> >>>>>> these new lists.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>>
> >>>>>> Ralph
> >>>>>>
> >>>>
> >>>>
> >>>
> >>>
> >
>
>
>


Re: Message Lists

2017-12-10 Thread Mike Percy
If necessary due to noise we can take this discussion back to private@ for a 
check-in.

Mike

Sent from my iPhone

> On Dec 10, 2017, at 9:15 AM, Ralph Goers <ralph.go...@dslextreme.com> wrote:
> 
> Thanks Mike. Yours is the only feedback in a month. I am uncomfortable 
> contacting infra to make the changes based on so little input.
> 
> Ralph
> 
>> On Dec 8, 2017, at 9:03 PM, Mike Percy <mpe...@apache.org> wrote:
>> 
>> Sorry, I didn't see this message because of all the automated emails!
>> 
>> +1 from me.
>> 
>> Mike
>> 
>> On Sat, Nov 11, 2017 at 9:43 PM, Ralph Goers <ralph.go...@dslextreme.com>
>> wrote:
>> 
>>> Currently all the messages from Jenkins, Jira and GitHub land in this
>>> mailing list. That makes this mailing list very cluttered and it is easy to
>>> miss discussions. Other projects use a “notifications” list to accept
>>> emails from those sources so the dev list can be left for person-to-person
>>> discussions.  I would like to propose that Flume switch to this model.
>>> Another alternative would be to have separate lists for each of those
>>> sources. My personal viewpoint is simply separating the automated emails is
>>> enough but I’d be willing to go along with any plan that moves the
>>> automated emails to another list.  I also think that doing this might
>>> increase the number of subscribers on the Flume dev list as lots of people
>>> don’t like to deal with all the extra email.
>>> 
>>> All that said, it would be expected that all committers would subscribe to
>>> these new lists.
>>> 
>>> Thoughts?
>>> 
>>> Ralph
>>> 
> 
> 



Re: Message Lists

2017-12-08 Thread Mike Percy
Sorry, I didn't see this message because of all the automated emails!

+1 from me.

Mike

On Sat, Nov 11, 2017 at 9:43 PM, Ralph Goers 
wrote:

> Currently all the messages from Jenkins, Jira and GitHub land in this
> mailing list. That makes this mailing list very cluttered and it is easy to
> miss discussions. Other projects use a “notifications” list to accept
> emails from those sources so the dev list can be left for person-to-person
> discussions.  I would like to propose that Flume switch to this model.
> Another alternative would be to have separate lists for each of those
> sources. My personal viewpoint is simply separating the automated emails is
> enough but I’d be willing to go along with any plan that moves the
> automated emails to another list.  I also think that doing this might
> increase the number of subscribers on the Flume dev list as lots of people
> don’t like to deal with all the extra email.
>
> All that said, it would be expected that all committers would subscribe to
> these new lists.
>
> Thoughts?
>
> Ralph
>


[ANNOUNCE] New Flume committers and PMC member

2017-11-07 Thread Mike Percy
Hello Flume community,

I'm very happy to announce that we now have 2 new committers and 1 new PMC
member on Apache Flume!

The first of our new Flume committers is Attila Simon (sati), who has a
long and steady history of contributing patches to Flume, including
security and usability improvements, and contributing to code reviews on
many pull requests.

Our second new Flume committer is Ferenc Szabo (szaboferee), who has been
contributing improvements to Flume for just a few short months but has
shown a high level of activity and engagement during that time, including
helping out during the Flume 1.8.0 release process.

Last, but not least, our new Flume PMC member is Denes Arvay (denes), who
recently took on the responsibility of managing the release of Flume 1.8.0
in collaboration with Ferenc and others. Denes has been working very hard
during the last several months of his tenure as a committer, as well as
before that, and has been driving the project forward with his many
contributions including his own improvements, mentoring others, many code
reviews and commits, and more.

Please join me in welcoming these folks to their new roles and
congratulating them on a job well done!

Best regards,
Mike Percy on behalf of the Flume PMC


Re: [ANNOUNCE] Apache Flume 1.8.0 released

2017-10-06 Thread Mike Percy
Congrats all! Nice work.

Regards,
Mike

On Wed, Oct 4, 2017 at 9:57 AM, Denes Arvay  wrote:

> The Apache Flume team is pleased to announce the release of Flume
> version 1.8.0.
>
> Flume is a distributed, reliable, and available service for efficiently
> collecting, aggregating, and moving large amounts of log data.
>
> This release can be downloaded from the Flume download page at:
> http://flume.apache.org/download.html
>
> The change log and documentation are available on the 1.8.0 release page:
> http://flume.apache.org/releases/1.8.0.html
>
> Your help and feedback is more than welcome. For more information on how
> to report problems and to get involved, visit the project website at
> http://flume.apache.org/
>
> The Apache Flume Team
>


Re: [VOTE] Release Apache Flume version 1.8.0 RC2

2017-09-21 Thread Mike Percy
+1 (binding)

I only checked the source artifact, I didn't check the binary convenience
artifacts.

- Checksums and sigs match
- LICENSE file looks good
- README file looks good
- Files match git tag
- Ran a full build and all tests passed
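
The checksum part of such a review can be sketched in Python. This is a
minimal illustration under stated assumptions: the helper names file_digest
and checksum_matches are hypothetical, and the checksum file is assumed to
start with the hex digest as its first whitespace-separated token (other
checksum file layouts would need different parsing). Signature verification
is not covered here and is still done with gpg --verify.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact_path, checksum_path, algorithm="sha1"):
    """Compare an artifact's digest with the one stored in a checksum file.

    Assumption: the first whitespace-separated token in the checksum file
    is the expected hex digest.
    """
    with open(checksum_path, "r", encoding="utf-8") as handle:
        tokens = handle.read().split()
    expected = tokens[0].lower() if tokens else ""
    return file_digest(artifact_path, algorithm) == expected
```

For an md5 checksum file, the same call works with algorithm="md5".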

Thanks for managing this release Denes, and to the others that helped!

Best,
Mike

On Fri, Sep 15, 2017 at 10:52 AM, Denes Arvay  wrote:

> Hi Flume Community,
>
> This is the eleventh release for Apache Flume as a top-level project,
> version 1.8.0. We are voting on release candidate RC2.
>
> It fixes the following issues:
>   https://raw.githubusercontent.com/apache/flume/release-1.8.0
> -rc2/CHANGELOG
>
> *** Please cast your vote within the next 72 hours ***
>
> The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
> for
> the source and binary artifacts can be found here:
>   http://people.apache.org/~denes/apache-flume-1.8.0-rc2/
>
> Maven staging repo:
>   https://repository.apache.org/content/repositories/orgapacheflume-1026/
>
> The tag to be voted on:
>   https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=43a3c40
>
> Flume's KEYS file containing PGP keys we use to sign the release:
>   https://www.apache.org/dist/flume/KEYS
>
> Thank you,
> Denes
>


Re: [VOTE] Release Apache Flume version 1.8.0 RC2

2017-09-19 Thread Mike Percy
Sorry for not voting yet. I've just been slammed. I'll take a look at RC2
tomorrow.

Mike

On Mon, Sep 18, 2017 at 5:46 AM, Denes Arvay  wrote:

> Hi All,
>
> I'd like to encourage everybody to check the RC2 and provide feedback about
> it. It's quite simple, just do the following steps:
> - Download the artifacts from the people.apache.org link I've sent in the
> initial email (see below)
> - Check the md5/sha1 checksums
> - Verify the signature: import the KEYS file (link below, howto in the KEYS
> file), then verify with gpg --verify apache-flume-1.8.0-src.tar.gz.asc
> and gpg --verify apache-flume-1.8.0-bin.tar.gz.asc
> - Check that the downloaded artifacts match the ones in the maven staging
> repo (org.apache.flume:flume-ng-dist:1.8.0, link to the repo below)
> - Extract the source tarball, compile and run the unit tests (note: some
> flaky tests might break, you might want to use
> the surefire.rerunFailingTestsCount flag)
> - Extract the binary tarball. You should be able to run it with one of the
> bin/flume-ng* scripts. (e.g. ./bin/flume-ng agent -c conf -f
> conf/flume-conf.properties.template -n agent
> -Dflume.root.logger=DEBUG,console)
> - Check the generated documents in the doc/ directory in the binary
> artifact.
> - Check the CHANGELOG and RELEASE-NOTES files.
> - Check that all the 3rd party jars are listed in the LICENSE file (the
> jars can be found in the binary package's lib/ directory)
>
> Of course if you do only a subset of these checks that'd be also a great
> help.
>
> Thank you,
> Denes
>
>
> On Fri, Sep 15, 2017 at 7:52 PM Denes Arvay  wrote:
>
> > Hi Flume Community,
> >
> > This is the eleventh release for Apache Flume as a top-level project,
> > version 1.8.0. We are voting on release candidate RC2.
> >
> > It fixes the following issues:
> >
> > https://raw.githubusercontent.com/apache/flume/release-1.8.
> 0-rc2/CHANGELOG
> >
> > *** Please cast your vote within the next 72 hours ***
> >
> > The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
> for
> > the source and binary artifacts can be found here:
> >   http://people.apache.org/~denes/apache-flume-1.8.0-rc2/
> >
> > Maven staging repo:
> >   https://repository.apache.org/content/repositories/
> orgapacheflume-1026/
> >
> > The tag to be voted on:
> >   https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=43a3c40
> >
> > Flume's KEYS file containing PGP keys we use to sign the release:
> >   https://www.apache.org/dist/flume/KEYS
> >
> > Thank you,
> > Denes
> >
>


Re: 1.8 RC1 failure - and how to fix it

2017-09-15 Thread Mike Percy
Hi Denes,
Shouldn't we use ./dev-support/generate-source-release.sh to generate the
source release? That guarantees that the source release matches the git
tag, and we don't have to worry about these kinds of weird problems.
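
Independently of how the source release is generated, a mismatch between a
source tarball and the tagged tree can be detected mechanically. The Python
sketch below is an illustration only and is not part of
generate-source-release.sh: the helper names tarball_paths and
missing_from_tarball are hypothetical, the tarball is assumed to wrap its
contents in a single top-level directory, and the list of tracked paths is
assumed to come from something like git ls-tree -r --name-only <tag>.

```python
import tarfile


def tarball_paths(tar_path, strip_components=1):
    """Return the regular-file paths in a tarball, minus leading directories.

    strip_components=1 drops the single top-level directory that a source
    tarball is assumed to wrap its contents in.
    """
    paths = set()
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            parts = member.name.split("/")[strip_components:]
            if parts:
                paths.add("/".join(parts))
    return paths


def missing_from_tarball(tar_path, tracked_paths, strip_components=1):
    """List tracked paths that do not appear in the tarball, sorted."""
    present = tarball_paths(tar_path, strip_components)
    return sorted(set(tracked_paths) - present)
```

An empty result means every tracked file is present in the tarball. A long
list, as in the RC1 case described in the quoted message below, points at a
packaging problem.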

Mike

On Thu, Sep 14, 2017 at 3:28 PM, Denes Arvay  wrote:

> Hi All,
>
> I have cancelled the RC1 vote because the source tarball was missing 90% of
> the required source code.
> It seems to be the same issue that we experienced during the 1.7 release:
> somehow the flume-checkstyle module interferes with the assembly plugin,
> so that none of the modules are included in the source package.
>
> Back then the issue was solved by removing the flume-checkstyle module from
> the 1.7 release:
> https://github.com/apache/flume/commit/38b5b3a7ed98cedaaef2b9351518a9
> fe09703a05
>
> I did some trial and error and I was able to fix this eventually by:
> a) changing flume-checkstyle's parent to flume-parent instead of the apache
> parent. This in itself breaks the build, so I had to
> b) set the packaging of flume-checkstyle to "pom"
> This fixes the packaging issues plus some other weird problems we bumped
> into during the release.
> Please have a look at my pull request and check whether Flume compiles and
> the packaging is ok (after running an mvn clean install check
> the flume-ng-dist/target/apache-flume-1.9.0-SNAPSHOT-src/
> apache-flume-1.9.0-SNAPSHOT-src
> directory, it should contain all the modules). (It worked on my machine,
> but I'd like to have some extra verification)
> https://github.com/apache/flume/pull/174
>
> Once this is committed I'll continue working on the RC2.
>
> Thanks,
> Denes
>


Re: [DISCUSS] Flume 1.8 release proposal

2017-09-12 Thread Mike Percy
I added the additional versions earlier today but neglected to notify the
list until just now. :)

+1 on the plan. Thanks for keeping us updated and continuing to drive this
release, Denes!

Mike

On Tue, Sep 12, 2017 at 3:04 AM, Denes Arvay  wrote:

> Hi Donat,
>
> Thanks for your help.
> Ferenc Szabo is already working on the retargeting, but it's definitely a
> good advice to do it in bulk to avoid spamming the lists.
>
> We have the following action items:
> - retarget the tickets
> - fix the blockers: there is only one left which I'm aware of:
> https://issues.apache.org/jira/browse/FLUME-3174. To fix that upgrading
> the
> joda-time is in progress: https://github.com/apache/flume/pull/169
> - the user guide is broken (netcat udp source's table is malformed), I'm
> fixing it
> - https://github.com/apache/flume/pull/168 needs to be committed. I've
> seen
> that you've already commented on it, thank you, will reply soon. (Spoiler:
> a lot of effort, unfortunately)
> -  some minor changes need to be done in the documentation (e.g. fixing the
> copyright dates, removing/updating the version references in the user
> guide). If anybody in the community feels like doing it I'd be more than
> happy to review & commit the changes.
> - once these are done I'm going to create the 1.8 branch and create the RC1
> release artifact. I'll announce the branching in advance to the dev@ list.
>
> Thank you,
> Denes
>
>
> On Tue, Sep 12, 2017 at 11:18 AM Bessenyei Balázs Donát
> wrote:
>
> > Hi Denes,
> >
> > It seems to me that 1.8.1 and 1.9.0 releases already exist in our JIRA.
> >
> > Regarding the retargeting: I'd be happy to batch-edit the necessary
> > tickets in order to avoid spamming the mailing lists.
> > Once you have a list of actions you'd like to do, please let us know.
> >
> >
> > Thank you,
> >
> > Donat
> >
> > 2017-09-11 19:39 GMT+02:00 Denes Arvay :
> > > Hi All,
> > >
> > > I'd like to let you know that we are planning to cut the 1.8 branch
> > > tomorrow around 2am PDT.
> > > If you think that there are any important tickets targeted to 1.8 still
> > > open which need to be reviewed and committed to get into the release,
> > > please let us know as soon as possible and we'll do our best to push
> > > them through.
> > >
> > > The ones which couldn't get committed by the branching will be
> retargeted
> > > to 1.8.1 or 1.9, depending on their type (i.e. bug fixes will be
> > retargeted
> > > to 1.8.1, new features will be scheduled for 1.9).
> > > For this I'd like to ask our PMC members to create these new releases
> in
> > > Jira, or if it's possible to grant the required Jira permissions to me,
> > I'd
> > > be more than happy to do this.
> > >
> > > Thank you,
> > > Denes
> > >
> > > On Mon, Sep 4, 2017 at 10:21 AM Denes Arvay 
> wrote:
> > >
> > >> Hi Flume Community,
> > >>
> > >> Almost a year has passed since we released Flume 1.7.
> > >> More than 50 commits were pushed since then, including documentation
> > >> fixes, many critical bug fixes and several important features, so I'd
> > like
> > >> to propose to publish the next minor release of Flume.
> > >>
> > >> I'd be happy to be the Release Manager with the help of Ferenc Szabo
> and
> > >> Marcell Hegedus who have been quite active recently, and Balazs Donat
> > >> Bessenyei who took the lion's share of the work during the previous
> > release
> > >> - if both community and they are OK with it.
> > >>
> > >> Among others the following major changes will be included in the next
> > >> release:
> > >>
> > >> Fixed bugs:
> > >> - FLUME-2857. Make Kafka Source/Channel/Sink restore default values
> when
> > >> live updating config
> > >> - FLUME-2812. Fix semaphore leak causing java.lang.Error: Maximum
> permit
> > >> count exceeded in MemoryChannel
> > >> - FLUME-3020. Improve HDFS Sink escape sequence substitution
> > >> - FLUME-3027. Change Kafka Channel to clear offsets map after commit
> > >> - FLUME-3049. Make HDFS sink rotate more reliably in secure mode
> > >> - FLUME-3080. Close failure in HDFS Sink might cause data loss
> > >> - FLUME-3085. HDFS Sink can skip flushing some BucketWriters, might
> lead
> > >> to data loss
> > >> - FLUME-2752. Fix AvroSource startup resource leaks
> > >> - FLUME-2905. Fix NetcatSource file descriptor leak if startup fails
> > >>
> > >> New features:
> > >> - FLUME-2171. Add Interceptor to remove headers from event
> > >> - FLUME-2993. Add support for environment variables in configuration
> > files
> > >> - New component: HTTP Sink
> > >> - FLUME-3100. Support arbitrary header substitution for topic of Kafka
> > Sink
> > >> - FLUME-2917. Provide netcat UDP source as alternative to TCP
> > >>
> > >> There are 35 open tickets targeted for 1.8 in patch available state:
> > >> https://s.apache.org/flume-1.8-target-tickets
> > >>
> > >> Plus we also have quite a lot (~65) open pull requests on GitHub:
> > >> 

Re: [DISCUSS] Flume 1.8 release proposal

2017-09-05 Thread Mike Percy
Hi Denes,
+1 from me for releasing a Flume 1.8.0 and for you taking on the RM role
for Flume 1.8.0. The timeline for a first RC seems fine.

It sounds like you will have some help, which is good. Let me know if you
need anything from me.

Regards,
Mike

On Mon, Sep 4, 2017 at 1:21 AM, Denes Arvay  wrote:

> Hi Flume Community,
>
> Almost a year has passed since we released Flume 1.7.
> More than 50 commits have been pushed since then, including documentation
> fixes, many critical bug fixes and several important features, so I'd like
> to propose publishing the next minor release of Flume.
>
> I'd be happy to be the Release Manager with the help of Ferenc Szabo and
> Marcell Hegedus, who have been quite active recently, and Balazs Donat
> Bessenyei, who took the lion's share of the work during the previous
> release, if both the community and they are OK with it.
>
> Among others, the following major changes will be included in the next
> release:
>
> Fixed bugs:
> - FLUME-2857. Make Kafka Source/Channel/Sink restore default values when
> live updating config
> - FLUME-2812. Fix semaphore leak causing java.lang.Error: Maximum permit
> count exceeded in MemoryChannel
> - FLUME-3020. Improve HDFS Sink escape sequence substitution
> - FLUME-3027. Change Kafka Channel to clear offsets map after commit
> - FLUME-3049. Make HDFS sink rotate more reliably in secure mode
> - FLUME-3080. Close failure in HDFS Sink might cause data loss
> - FLUME-3085. HDFS Sink can skip flushing some BucketWriters, might lead to
> data loss
> - FLUME-2752. Fix AvroSource startup resource leaks
> - FLUME-2905. Fix NetcatSource file descriptor leak if startup fails
>
> New features:
> - FLUME-2171. Add Interceptor to remove headers from event
> - FLUME-2993. Add support for environment variables in configuration files
> - New component: HTTP Sink
> - FLUME-3100. Support arbitrary header substitution for topic of Kafka Sink
> - FLUME-2917. Provide netcat UDP source as alternative to TCP
>
> There are 35 open tickets targeted for 1.8 in patch available state:
> https://s.apache.org/flume-1.8-target-tickets
>
> Plus we also have quite a lot (~65) open pull requests on GitHub:
> https://github.com/apache/flume/pulls
>
> Some of the above-mentioned tickets/pull requests already have some review
> comments, so at least part of this list can get into this release besides
> the already pushed ones.
>
> I'd like to propose targeting the week of the 11th of September for the
> first release candidate. That would mean that the branch date would be the
> 11th, so any significant code change should get in by that date.
>
> If nobody has any concerns then I'm going to create an umbrella ticket to
> track the release process.
>
> Kind regards,
> Denes
>


[jira] [Commented] (FLUME-3115) Upgrade netty library dependency

2017-07-05 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075293#comment-16075293
 ] 

Mike Percy commented on FLUME-3115:
---

The CVE says versions of Netty prior to 3.9.2 are vulnerable to a DoS attack 
when using SslHandler. Curator is pulling in the old Netty version. The version 
that Flume depends on (looking at trunk) is 3.9.4, but since both are on the 
classpath, it's possible that either one is the version actually being loaded.

Really, Curator and Flume should both probably be shading Netty.

Flume may be vulnerable to this DoS today because it uses SslHandler in a 
couple of places:

{code}
$ ag -l SslHandler
flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java
flume-ng-core/src/test/java/org/apache/flume/source/TestAvroSource.java
flume-ng-core/src/test/java/org/apache/flume/sink/TestAvroSink.java
flume-ng-sdk/src/main/java/org/apache/flume/api/NettyAvroRpcClient.java
{code}
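
One way to check which Netty jar actually wins on a given classpath is to ask the JVM where it loaded SslHandler from. The following is only a minimal sketch: the class name ClassOrigin and the helper locate() are made up for illustration, and the default class name assumes the Netty 3.x package layout (org.jboss.netty).

```java
import java.security.CodeSource;

class ClassOrigin {
    /** Returns the location a class was loaded from, or a note when none is recorded. */
    static String locate(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // JDK bootstrap classes usually have no CodeSource at all.
        return src == null ? "(no code source, likely a JDK class)"
                           : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumption: Netty 3.x class name; pass a different name as the first argument.
        String name = args.length > 0 ? args[0] : "org.jboss.netty.handler.ssl.SslHandler";
        System.out.println(name + " <- " + locate(Class.forName(name)));
    }
}
```

Run with the same classpath as the agent, this prints the jar that SslHandler was resolved from, which shows whether the older or the newer Netty is in use.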

> Upgrade netty library dependency
> 
>
> Key: FLUME-3115
> URL: https://issues.apache.org/jira/browse/FLUME-3115
> Project: Flume
> Issue Type: Bug
> Affects Versions: 1.7.0
> Reporter: Attila Simon
> Priority: Critical
> Labels: dependency
> Fix For: 1.8.0
>
>
> ||Group||Artifact||Version used||Upgrade target||
> |io.netty|netty|3.2.2.Final, 3.9.4.Final|4.1.12.Final|
> Note: This artifact was moved to:
> - New Group: io.netty
> - New Artifact: netty-all
> Security vulnerability: http://www.cvedetails.com/cve/CVE-2014-3488/
> Please do:
> - double-check the newest version.
> - consider removing the dependency if a better alternative is available.
> - check whether the library change would introduce a backward incompatibility
> (in which case please add the label `breaking_change` and set the fix
> version to the next major release)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Update 3rd party dependencies

2017-07-05 Thread Mike Percy
Hi Attila,
Thanks for sending this. I have a few thoughts / questions on this:

1) You didn't include the analysis of A,G,S, etc. for the listed
dependencies in your email.
2) If there are security vulnerabilities reported that could affect Flume
then we should upgrade those dependencies where possible. However, in my
experience newer does not always mean better (a newer library may introduce
new bugs in exchange for new features we do not use) so I am not sure I
agree with the basic premise that we should avoid being on older versions
of libraries.
3) From a quick look at mvn dependency:tree the majority of those libs are
pulled in transitively by other projects. How do you propose dealing with
that?

I ran a quick script based on mvn dependency:tree and your list above and
marked the libraries you mentioned with an arrow (<---) to illustrate where
they come from (see below). Hope this is useful.
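
For anyone who wants to repeat the marking step, here is a rough sketch of how it could be done. It is not the script I actually ran; DependencyTreeMarker and mark() are hypothetical names, and the input is assumed to be the lines of mvn dependency:tree output plus a set of groupId:artifactId pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class DependencyTreeMarker {
    /** Appends an arrow to every tree line that mentions one of the given groupId:artifactId pairs. */
    static List<String> mark(List<String> treeLines, Set<String> coordinates) {
        List<String> out = new ArrayList<>();
        for (String line : treeLines) {
            boolean hit = false;
            for (String coordinate : coordinates) {
                // The leading space and trailing colon avoid matching "jetty" inside "jetty-util".
                if (line.contains(" " + coordinate + ":")) {
                    hit = true;
                    break;
                }
            }
            out.add(hit ? line + "   <---" : line);
        }
        return out;
    }
}
```

Feeding it the tree output and a set such as io.netty:netty and org.mortbay.jetty:jetty would mark only the lines for those exact artifacts.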

Thanks,
Mike

[INFO] --- maven-dependency-plugin:2.10:tree (default-cli) @ flume-checkstyle ---
[INFO] org.apache.flume:flume-checkstyle:jar:1.8.0-SNAPSHOT
[INFO]

[INFO]

[INFO] Building Apache Flume 1.8.0-SNAPSHOT
[INFO]

[INFO]
[INFO] --- maven-dependency-plugin:2.10:tree (default-cli) @ flume-parent ---
[INFO] org.apache.flume:flume-parent:pom:1.8.0-SNAPSHOT
[INFO]

[INFO]

[INFO] Building Flume NG SDK 1.8.0-SNAPSHOT
[INFO]

[INFO]
[INFO] --- maven-dependency-plugin:2.10:tree (default-cli) @ flume-ng-sdk ---
[INFO] org.apache.flume:flume-ng-sdk:jar:1.8.0-SNAPSHOT
[INFO] +- junit:junit:jar:4.10:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.1:test
[INFO] +- org.slf4j:slf4j-api:jar:1.6.1:compile
[INFO] +- org.slf4j:slf4j-log4j12:jar:1.6.1:compile
[INFO] |  \- log4j:log4j:jar:1.2.17:compile
[INFO] +- org.apache.avro:avro:jar:1.7.4:compile
[INFO] |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.3:compile   <---
[INFO] |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.3:compile
[INFO] |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] |  +- org.xerial.snappy:snappy-java:jar:1.1.0:compile
[INFO] |  \- org.apache.commons:commons-compress:jar:1.4.1:compile
[INFO] |     \- org.tukaani:xz:jar:1.0:compile
[INFO] +- org.apache.avro:avro-ipc:jar:1.7.4:compile
[INFO] |  +- org.mortbay.jetty:jetty:jar:6.1.26:compile   <---
[INFO] |  +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile   <---
[INFO] |  \- org.apache.velocity:velocity:jar:1.7:compile
[INFO] |     \- commons-collections:commons-collections:jar:3.2.2:compile
[INFO] +- io.netty:netty:jar:3.9.4.Final:compile   <---
[INFO] \- org.apache.thrift:libthrift:jar:0.9.0:compile
[INFO]    +- commons-lang:commons-lang:jar:2.5:compile
[INFO]    +- org.apache.httpcomponents:httpclient:jar:4.2.1:compile   <---
[INFO]    |  +- commons-logging:commons-logging:jar:1.1.1:compile
[INFO]    |  \- commons-codec:commons-codec:jar:1.8:compile
[INFO]    \- org.apache.httpcomponents:httpcore:jar:4.1.3:compile
[INFO]

[INFO]

[INFO] Building Flume NG Configuration 1.8.0-SNAPSHOT
[INFO]

[INFO]
[INFO] --- maven-dependency-plugin:2.10:tree (default-cli) @ flume-ng-configuration ---
[INFO] org.apache.flume:flume-ng-configuration:jar:1.8.0-SNAPSHOT
[INFO] +- org.slf4j:slf4j-api:jar:1.6.1:compile
[INFO] +- junit:junit:jar:4.10:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.1:test
[INFO] +- org.slf4j:slf4j-log4j12:jar:1.6.1:compile
[INFO] |  \- log4j:log4j:jar:1.2.17:compile
[INFO] +- com.google.guava:guava:jar:11.0.2:compile
[INFO] |  \- com.google.code.findbugs:jsr305:jar:1.3.9:compile
[INFO] \- org.apache.flume:flume-ng-sdk:jar:1.8.0-SNAPSHOT:compile
[INFO]    +- org.apache.avro:avro:jar:1.7.4:compile
[INFO]    |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.3:compile   <---
[INFO]    |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.3:compile
[INFO]    |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO]    |  +- org.xerial.snappy:snappy-java:jar:1.1.0:compile
[INFO]    |  \- org.apache.commons:commons-compress:jar:1.4.1:compile
[INFO]    |     \- org.tukaani:xz:jar:1.0:compile
[INFO]    +- org.apache.avro:avro-ipc:jar:1.7.4:compile
[INFO]    |  +- org.mortbay.jetty:jetty:jar:6.1.26:compile   <---
[INFO]    |  +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile   <---
[INFO]    |  \- org.apache.velocity:velocity:jar:1.7:compile
[INFO]    |     \- commons-collections:commons-collections:jar:3.2.2:compile
[INFO]    +- io.netty:netty:jar:3.9.4.Final:compile   <---
[INFO]    \- org.apache.thrift:libthrift:jar:0.9.0:compile
[INFO]       +- commons-lang:commons-lang:jar:2.5:compile
[INFO]       +- 

[jira] [Commented] (FLUME-2957) Remove Guava from our public API

2017-07-05 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075107#comment-16075107
 ] 

Mike Percy commented on FLUME-2957:
---

I agree that we should simply expose a Map instead of the Guava ImmutableMap 
implementation as part of this public API.
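
To make the idea concrete, here is a small sketch of what returning a plain Map could look like. ContextSketch is a simplified stand-in, not the real org.apache.flume.Context, and wrapping a copy with Collections.unmodifiableMap is only one possible way to keep the read-only behaviour without exposing a Guava type.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class ContextSketch {
    private final Map<String, String> parameters = new HashMap<>();

    void put(String key, String value) {
        parameters.put(key, value);
    }

    /** Exposes only java.util.Map; callers receive a read-only snapshot. */
    Map<String, String> getParameters() {
        return Collections.unmodifiableMap(new HashMap<>(parameters));
    }
}
```

Callers that only read the map keep working unchanged, while code that relied on ImmutableMap-specific methods would need to be adapted, which is why this counts as an API break.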

> Remove Guava from our public API
> 
>
> Key: FLUME-2957
> URL: https://issues.apache.org/jira/browse/FLUME-2957
> Project: Flume
> Issue Type: Task
> Affects Versions: 1.8.0
> Reporter: Lior Zeno
> Fix For: 2.0.0
>
>
> Context.getParameters (flume-ng-configuration module) returns 
> com.google.common.collect.ImmutableMap (Guava). We should clean our API and 
> return either a native java interface or Flume's.
> In addition to the current state being a bad practice, this also means that 
> we are unable to shade Guava in Flume.
> Note: Since this breaks our public API, I'll reschedule this issue to 2.0 
> once we have this version managed in jira.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[ANNOUNCE] New Flume committer - Denes Arvay

2017-05-21 Thread Mike Percy
On behalf of the Apache Flume PMC, I am very pleased to welcome Denes Arvay
as a committer on the Apache Flume project.

Denes has put a lot of effort into improving the stability of Flume, most
recently focusing on identifying and fixing serious and hard-to-diagnose
issues including several bugs that could cause data loss.

Congratulations and welcome, Denes!

Best,
Mike


[jira] [Resolved] (FLUME-3092) Extend the FileChannel's monitoring metrics

2017-05-16 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy resolved FLUME-3092.
---
   Resolution: Fixed
Fix Version/s: 1.8.0

> Extend the FileChannel's monitoring metrics
> ---
>
> Key: FLUME-3092
> URL: https://issues.apache.org/jira/browse/FLUME-3092
> Project: Flume
> Issue Type: Improvement
> Components: File Channel
> Affects Versions: 1.7.0
> Reporter: Denes Arvay
> Assignee: Denes Arvay
> Fix For: 1.8.0
>
>
> There are already several generic metrics (e.g. {{eventPutAttemptCount}} and 
> {{eventPutSuccessCount}}) which can be used to create compound metrics for 
> monitoring the FileChannel's health.
> Some monitoring systems aren't capable of calculating such derived metrics,
> though, so I recommend adding the following extra counters to represent
> whether a channel operation failed or the channel is in an unhealthy state.
> - {{eventPutErrorCount}}: incremented if an {{IOException}} occurs during
> a {{put}} operation.
> - {{eventTakeErrorCount}}: incremented if an {{IOException}} or
> {{CorruptEventException}} occurs during a {{take}} operation.
> - {{checkpointWriteErrorCount}}: incremented if an exception occurs during
> a checkpoint write.
> - {{unhealthy}}: this flag indicates whether the channel failed to start
> (i.e. the replay hit a problem). It mirrors the already existing {{open}}
> flag: {{open}} is initially {{false}} and is set to {{true}} once the
> initialization (including the log replay) completes successfully, whereas
> {{unhealthy}} is {{false}} by default and is set to {{true}} if there is
> an error during startup.
> Besides these flags I'd also introduce a {{closed}} flag, which is the
> numeric representation (1: closed, 0: open) of the negated (already
> existing) {{open}} flag.
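
The counters described above could be sketched roughly as follows. This is an illustration only, assuming plain AtomicLong counters and volatile flags; FileChannelCounterSketch and its method names are hypothetical and are not the class that was committed for this issue.

```java
import java.util.concurrent.atomic.AtomicLong;

class FileChannelCounterSketch {
    private final AtomicLong eventPutErrorCount = new AtomicLong();
    private final AtomicLong eventTakeErrorCount = new AtomicLong();
    private final AtomicLong checkpointWriteErrorCount = new AtomicLong();
    private volatile boolean open = false;       // true once init and replay succeed
    private volatile boolean unhealthy = false;  // true if startup failed

    long incrementEventPutErrorCount() { return eventPutErrorCount.incrementAndGet(); }
    long incrementEventTakeErrorCount() { return eventTakeErrorCount.incrementAndGet(); }
    long incrementCheckpointWriteErrorCount() { return checkpointWriteErrorCount.incrementAndGet(); }

    void setOpen(boolean value) { open = value; }
    void setUnhealthy(boolean value) { unhealthy = value; }

    /** Numeric form of the negated open flag: 1 means closed, 0 means open. */
    int getClosed() { return open ? 0 : 1; }

    /** Numeric form of the unhealthy flag so simple monitoring tools can alert on it. */
    int getUnhealthy() { return unhealthy ? 1 : 0; }
}
```

Exposing the flags as 0/1 values is what lets a monitoring system without derived-metric support alert on a simple threshold.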



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLUME-3085) HDFS Sink can skip flushing some BucketWriters, might lead to data loss

2017-05-08 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy resolved FLUME-3085.
---
Resolution: Fixed

Pushed to trunk.

> HDFS Sink can skip flushing some BucketWriters, might lead to data loss
> ---
>
> Key: FLUME-3085
> URL: https://issues.apache.org/jira/browse/FLUME-3085
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: 1.7.0
> Reporter: Denes Arvay
> Assignee: Denes Arvay
> Priority: Critical
> Fix For: 1.8.0
>
>
> The {{HDFSEventSink.process()}} is already prepared for a rare race 
> condition, namely when the BucketWriter acquired in [line 
> 389|https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java#L389]
>  gets closed by another thread (e.g. because of the {{idleTimeout}} or the 
> {{rollInterval}}) before the {{append()}} is called in [line 
> 406|https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java#L406].
> If this is the case, the {{BucketWriter.append()}} call throws a 
> {{BucketClosedException}} and the sink creates a new {{BucketWriter}} 
> instance and appends to it.
> But this newly created instance won't be added to the {{writers}} list, which 
> means that it won't be flushed after the processing loop finishes: 
> https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java#L429
> This has multiple consequences:
> - unflushed data might get lost
> - the {{BucketWriter}}'s {{idleAction}} won't be scheduled 
> (https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java#L450),
>  which means that it will be neither closed nor renamed if the idle timeout is 
> the only trigger for closing the file.
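
The problem and one possible remedy can be illustrated with a stripped-down model of the processing loop. This is a sketch under simplifying assumptions, not the patch that was pushed: Writer stands in for BucketWriter, IllegalStateException stands in for BucketClosedException, and all names here are hypothetical. The point is that the writer is registered for flushing only after the append (or retry) has settled which instance took the event.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FlushTrackingSketch {
    /** Stand-in for BucketWriter; not the real Flume class. */
    static class Writer {
        boolean closed;
        int appended;
        int flushes;

        void append(String event) {
            if (closed) {
                throw new IllegalStateException("bucket closed");
            }
            appended++;
        }

        void flush() { flushes++; }
    }

    /** Appends each event, replaces a closed writer, and flushes every writer that took data. */
    static Set<Writer> process(List<String> events, Writer initial) {
        Set<Writer> writers = new LinkedHashSet<>();
        Writer current = initial;
        for (String event : events) {
            try {
                current.append(event);
            } catch (IllegalStateException bucketClosed) {
                current = new Writer();   // replacement for the closed bucket
                current.append(event);
            }
            writers.add(current);         // track whichever writer actually took the event
        }
        for (Writer writer : writers) {
            writer.flush();
        }
        return writers;
    }
}
```

With this ordering the replacement writer ends up in the set and is flushed, whereas registering the writer before the append would leave the replacement untracked.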



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLUME-3085) HDFS Sink can skip flushing some BucketWriters, might lead to data loss

2017-05-08 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated FLUME-3085:
--
Fix Version/s: 1.8.0

> HDFS Sink can skip flushing some BucketWriters, might lead to data loss
> ---
>
> Key: FLUME-3085
> URL: https://issues.apache.org/jira/browse/FLUME-3085
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: 1.7.0
> Reporter: Denes Arvay
> Assignee: Denes Arvay
> Priority: Critical
> Fix For: 1.8.0
>
>
> The {{HDFSEventSink.process()}} is already prepared for a rare race 
> condition, namely when the BucketWriter acquired in [line 
> 389|https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java#L389]
>  gets closed by another thread (e.g. because of the {{idleTimeout}} or the 
> {{rollInterval}}) before the {{append()}} is called in [line 
> 406|https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java#L406].
> If this is the case, the {{BucketWriter.append()}} call throws a 
> {{BucketClosedException}} and the sink creates a new {{BucketWriter}} 
> instance and appends to it.
> But this newly created instance won't be added to the {{writers}} list, which 
> means that it won't be flushed after the processing loop finishes: 
> https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java#L429
> This has multiple consequences:
> - unflushed data might get lost
> - the {{BucketWriter}}'s {{idleAction}} won't be scheduled 
> (https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java#L450),
>  which means that it will be neither closed nor renamed if the idle timeout is 
> the only trigger for closing the file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLUME-2293) Isolate Flume agent plugins to their own classloader

2017-03-19 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated FLUME-2293:
--
Affects Version/s: (was: 1.7.0)
   v1.7.0

> Isolate Flume agent plugins to their own classloader
> 
>
> Key: FLUME-2293
> URL: https://issues.apache.org/jira/browse/FLUME-2293
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources, Technical Debt
> Affects Versions: 1.7.0
> Reporter: Joshua Hyde
>
> This is tangential to the FLUME-2286 issue I raised, but this would probably 
> negate it:
> It'd be nice if Flume plugins had classloaders isolated from the {{lib/}} 
> directory of Flume (and the Flume agent itself was isolated from the plugins 
> directory). This would allow plugins to exercise a bit more freedom in their 
> dependency stack (such as using more recent versions of Guava) without 
> interfering with the ability of the Flume agent to run (and without 
> interference from the agent's dependencies).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLUME-2293) Isolate Flume agent plugins to their own classloader

2017-03-19 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated FLUME-2293:
--
Component/s: Technical Debt

> Isolate Flume agent plugins to their own classloader
> 
>
> Key: FLUME-2293
> URL: https://issues.apache.org/jira/browse/FLUME-2293
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources, Technical Debt
> Affects Versions: 1.7.0
> Reporter: Joshua Hyde
>
> This is tangential to the FLUME-2286 issue I raised, but this would probably 
> negate it:
> It'd be nice if Flume plugins had classloaders isolated from the {{lib/}} 
> directory of Flume (and the Flume agent itself was isolated from the plugins 
> directory). This would allow plugins to exercise a bit more freedom in their 
> dependency stack (such as using more recent versions of Guava) without 
> interfering with the ability of the Flume agent to run (and without 
> interference from the agent's dependencies).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLUME-2293) Isolate Flume agent plugins to their own classloader

2017-03-19 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated FLUME-2293:
--
Affects Version/s: 1.7.0

> Isolate Flume agent plugins to their own classloader
> 
>
> Key: FLUME-2293
> URL: https://issues.apache.org/jira/browse/FLUME-2293
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources, Technical Debt
> Affects Versions: 1.7.0
> Reporter: Joshua Hyde
>
> This is tangential to the FLUME-2286 issue I raised, but this would probably 
> negate it:
> It'd be nice if Flume plugins had classloaders isolated from the {{lib/}} 
> directory of Flume (and the Flume agent itself was isolated from the plugins 
> directory). This would allow plugins to exercise a bit more freedom in their 
> dependency stack (such as using more recent versions of Guava) without 
> interfering with the ability of the Flume agent to run (and without 
> interference from the agent's dependencies).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [ANNOUNCE] Bessenyei Balázs Donát joining the Flume PMC

2017-03-11 Thread Mike Percy
> On Mar 11, 2017, at 12:57 AM, Mike Percy <mpe...@apache.org> wrote:
> 
> Donat was also instrumental to the latest release by acting as the release 
> manager for Flume 1.6.0.

Oops, I meant Flume 1.7.0 :)

Mike




[ANNOUNCE] Bessenyei Balázs Donát joining the Flume PMC

2017-03-11 Thread Mike Percy
Hi Flume community,

Today I'm very happy to announce that the PMC has voted to add Bessenyei
Balázs Donát (Donat) as a PMC member on the Apache Flume project!

Donat joined the Flume project as a committer back in September and since
that time has taken on the work of doing many code reviews and committing
many contributed patches. He also acted as a shepherd to include a new sink
into the main project: the HTTP Sink from Ben Wheeler. Donat was also
instrumental to the latest release by acting as the release manager for
Flume 1.6.0.

Thank you Donat for your ongoing contributions. Please join me in
congratulating Donat!

Mike


Re: Travis-CI build hung

2017-02-25 Thread Mike Percy
Oh, great! It was hung for 17 hours last time I checked but I guess Travis
solved the problem on its own.

Thanks,
Mike

On Sat, Feb 25, 2017 at 7:49 AM, Hari Shreedharan <hshreedha...@apache.org>
wrote:

> Looks like it passed. Must have been Travis being under provisioned or
> something
>
> On Feb 24, 2017 3:32 PM, "Mike Percy" <mpe...@apache.org> wrote:
>
> > The Travis pre-commit build request for PR #109 seems hung:
> > https://travis-ci.org/apache/flume/builds/204857948
> >
> > Does anybody know if I need to have some special permissions to cancel
> that
> > request and resubmit?
> >
> > Thanks,
> > Mike
> >
>


Travis-CI build hung

2017-02-24 Thread Mike Percy
The Travis pre-commit build request for PR #109 seems hung:
https://travis-ci.org/apache/flume/builds/204857948

Does anybody know if I need to have some special permissions to cancel that
request and resubmit?

Thanks,
Mike


[jira] [Assigned] (FLUME-3056) TestApplication hangs indefinitely

2017-02-23 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy reassigned FLUME-3056:
-

Assignee: Andras Beni

> TestApplication hangs indefinitely
> --
>
> Key: FLUME-3056
> URL: https://issues.apache.org/jira/browse/FLUME-3056
> Project: Flume
> Issue Type: Bug
> Components: Configuration, Test
> Affects Versions: v1.7.0
> Reporter: Andras Beni
> Assignee: Andras Beni
> Priority: Minor
>
> Unit test hangs indefinitely when TestApplication.testFLUME1854() becomes 
> blocked in the following, deadlock-like situation:
> Application waits for PollingPropertiesFileConfigurationProvider to stop 
> while PollingPropertiesFileConfigurationProvider tries to notify Application 
> of configuration change.
> {noformat}
> "conf-file-poller-0" #17750 prio=5 os_prio=31 tid=0x7fdeb7972000 nid=0x638f waiting for monitor entry [0x7eb36000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>   at org.apache.flume.node.Application.handleConfigurationEvent(Application.java:87)
>   - waiting to lock <0x00077a130178> (a org.apache.flume.node.Application)
>   at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.google.common.eventbus.EventHandler.handleEvent(EventHandler.java:68)
>   at com.google.common.eventbus.SynchronizedEventHandler.handleEvent(SynchronizedEventHandler.java:45)
>   - locked <0x00077a1301b0> (a com.google.common.eventbus.SynchronizedEventHandler)
>   at com.google.common.eventbus.EventBus.dispatch(EventBus.java:313)
>   at com.google.common.eventbus.EventBus.dispatchQueuedEvents(EventBus.java:296)
>   at com.google.common.eventbus.EventBus.post(EventBus.java:264)
>   at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>   - <0x00077a130250> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "main" #1 prio=5 os_prio=31 tid=0x7fdeb900 nid=0x1b03 waiting on condition [0x7d779000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0006c253e9f8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>   at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1465)
>   at java.util.concurrent.Executors$DelegatedExecutorService.awaitTermination(Executors.java:675)
>   at org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop(PollingPropertiesFileConfigurationProvider.java:88)
>   at org.apache.flume.lifecycle.LifecycleSupervisor.stop(LifecycleSupervisor.java:104)
>   - locked <0x00077a138158> (a org.apache.flume.lifecycle.LifecycleSupervisor)
>   at org.apache.flume.node.Application.stop(Application.java:92)
>   - locked <0x00077a130178> (a org.apache.flume.node.Application)
>   at org.apache.flume.node.TestApplication.testFLUME1854(TestApplication.java:155)
> {noformat}
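
The interleaving in the dump can be reproduced in miniature. The sketch below is an assumption-laden model, not Flume code: StopWhilePollingSketch and its members are hypothetical, the latch only forces the unlucky ordering, and a bounded wait is used so the example terminates instead of hanging the way the test does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class StopWhilePollingSketch {
    private final ExecutorService poller = Executors.newSingleThreadExecutor();
    private final CountDownLatch stopHoldsLock = new CountDownLatch(1);

    /** Stand-in for the configuration event handler; it needs this object's monitor. */
    synchronized void handleConfigurationEvent() { }

    void start() {
        poller.execute(() -> {
            try {
                stopHoldsLock.await();        // wait until stop() owns the monitor
            } catch (InterruptedException e) {
                return;
            }
            handleConfigurationEvent();       // blocks while stop() owns the monitor
        });
    }

    /** Returns true only if the poller terminated within the timeout. */
    synchronized boolean stop(long timeoutMillis) {
        stopHoldsLock.countDown();
        poller.shutdown();
        try {
            return poller.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Calling start() and then stop(200) returns false: the poller cannot finish until stop() releases the monitor, and stop() is waiting for the poller. If the wait is repeated until it succeeds, the two threads block each other indefinitely, which matches the two stacks above.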



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLUME-3055) Taildir source FD leaks when the matched files is renamed or rotated to other dir

2017-02-19 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873870#comment-15873870
 ] 

Mike Percy commented on FLUME-3055:
---

Thanks [~huLiu] for confirming!

> Taildir source FD leaks when the matched files is renamed or rotated to other 
> dir
> -
>
> Key: FLUME-3055
> URL: https://issues.apache.org/jira/browse/FLUME-3055
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.7.0
> Reporter: Hu Liu,
> Assignee: Hu Liu,
>
> In our environment, the log files are periodically rotated to another dir.
> We found an FD leak when using the taildir source: because the taildir
> source only handles the matched files, rotated log files no longer appear
> in the matched file list, so idleFileChecker doesn't clean them up.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: REST API Proposal

2017-02-16 Thread Mike Percy
Thanks for sending this out, Tristan. Sorry for my late response. I added a
couple of comments to the doc. One missing piece is how the configuration
changes get persisted (or whether they get persisted at all).

Due to other commitments it would be difficult for me to help out very
much. However, it may be possible for the PMC to grant access for a feature
branch to use for collaboration.

Mike


On Wed, Feb 1, 2017 at 12:09 AM, Tristan Stevens 
wrote:

> Hi all,
> I'd like to put forward a proposal that I've been considering based on
> conversations with users and on observations of some threads on this
> mailing list.
>
> My proposal is that we build into Flume a REST API that would give
> administrators greater control over a running instance of Flume. Basically
> I'm thinking of the following features:
>  - Allow browsing of status and configuration of all components (Sources,
> Channels, Sinks).
>  - Allow starting and stopping of individual components.
>  - Allow deployment of new components (Sources, Channels, Sinks) into a
> running agent.
>  - Allow modification of configuration of deployed components (Sources,
> Channels, Sinks).
>  - Allow modification of log4j configuration of a running instance
> (FLUME-3038).
>
> Overall long-term goal: Eliminate the need for routine administration of
> Flume via command-line during a dev-cycle and increase the
> supportability/administerability of Flume in general.
>
> In terms of benefits, my thinking is as follows:
>
>  - Granular visibility of component statuses.
>  - Graceful shutdown of agent (e.g. shut down Sources, allow Channels to
> drain, and then shut down Sinks) (I think there's a JIRA kicking around for
> this)
>  - Failure scenario management:
>- Enable re-pointing of Sinks (e.g. because of downstream issues)
> without interrupting Sources or losing events in Channels.
>- Re-configuring channels or sinks in order to improve performance.
>- Add sinks to running instance in order to relieve pressure on
> over-full channel.
>  - Improve developer experience by allowing for dynamic (re)configuration
> of agent without using the command-line and without needing to restart the
> whole process.
>  - Significantly lower the barrier to adoption for both developers and
> administrators.
>
> There is also the possibility that we could then support third party
> tooling for building interactive web UIs on top of Flume, which would
> greatly improve usability for both administrators and also developers (e.g.
> configurators).
>
> I've knocked together a bit of a design proposal which I've made available
> at:
> https://docs.google.com/document/d/1OKrX__YVfMMSInezgIOj53j6JYPKfI8mNF7RP1wIhA8/edit?usp=sharing
> Please add specific comments inline in the doc and general comments back to
> this thread.
>
> My question to the group is threefold:
>
> 1. Is this something that we think is a) worthwhile and b) achievable?
> 2. I'm happy to lead the development, but is there a committer who can
> offer time to review and support?
> 3. Would anyone else be interested in contributing features or testing?
>
> Many thanks,
> Tristan
>


[jira] [Commented] (FLUME-3055) Taildir source FD leaks when the matched files is renamed or rotated to other dir

2017-02-16 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870983#comment-15870983
 ] 

Mike Percy commented on FLUME-3055:
---

Hi [~huLiu], is this a real bug? Did you work around the problem?

> Taildir source FD leaks when the matched files is renamed or rotated to other 
> dir
> -
>
> Key: FLUME-3055
> URL: https://issues.apache.org/jira/browse/FLUME-3055
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.7.0
> Reporter: Hu Liu,
> Assignee: Hu Liu,
>
> In our environment, the log files are rotated to another directory
> periodically. We found an FD leak when using the taildir source: the source
> only handles files in the matched file list, so once log files are rotated
> they no longer appear in that list and idleFileChecker doesn't clean them up.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLUME-2427) java.lang.NoSuchMethodException and warning on HDFS (S3) sink

2017-02-16 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy resolved FLUME-2427.
---
   Resolution: Fixed
Fix Version/s: v1.8.0

> java.lang.NoSuchMethodException and warning on HDFS (S3) sink 
> --
>
> Key: FLUME-2427
> URL: https://issues.apache.org/jira/browse/FLUME-2427
> Project: Flume
> Issue Type: Question
> Reporter: Bijith Kumar
> Assignee: Ping Wang
> Priority: Minor
> Fix For: v1.8.0
>
> Attachments: FLUME-2427-0.patch
>
>
> The warning and exception below are thrown every time a file is written to S3
> using the HDFS sink. It looks like a jar mismatch to me. I tried the latest
> Hadoop and jets3 jars, but that didn't help.
> 17 Jul 2014 23:30:18,159 INFO  [hdfs-s3sink-engagements-call-runner-6] 
> (org.apache.flume.sink.hdfs.AbstractHDFSWriter.reflectGetNumCurrentReplicas:184)
>   - FileSystem's output stream doesn't support getNumCurrentReplicas; 
> --HDFS-826 not available; 
> fsOut=org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream;
>  err=java.lang.NoSuchMethodException: 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.getNumCurrentReplicas()
> 17 Jul 2014 23:30:18,160 WARN  
> [SinkRunner-PollingRunner-DefaultSinkProcessor] 
> (org.apache.flume.sink.hdfs.BucketWriter.getRefIsClosed:210)  - isFileClosed 
> is not available in the version of HDFS being used. Flume will not attempt to 
> close files if the close fails on the first attempt
> java.lang.NoSuchMethodException: 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem.isFileClosed(org.apache.hadoop.fs.Path)
>   at java.lang.Class.getMethod(Class.java:1665)
>   at 
> org.apache.flume.sink.hdfs.BucketWriter.getRefIsClosed(BucketWriter.java:207)
>   at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:295)
>   at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:554)
>   at 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:426)
>   at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>   at java.lang.Thread.run(Thread.java:744)





[jira] [Assigned] (FLUME-2427) java.lang.NoSuchMethodException and warning on HDFS (S3) sink

2017-02-16 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy reassigned FLUME-2427:
-

Assignee: Mike Percy

> java.lang.NoSuchMethodException and warning on HDFS (S3) sink 
> --
>
> Key: FLUME-2427
> URL: https://issues.apache.org/jira/browse/FLUME-2427
> Project: Flume
> Issue Type: Question
> Reporter: Bijith Kumar
> Assignee: Mike Percy
> Priority: Minor
> Attachments: FLUME-2427-0.patch
>
>
> The warning and exception below are thrown every time a file is written to S3
> using the HDFS sink. It looks like a jar mismatch to me. I tried the latest
> Hadoop and jets3 jars, but that didn't help.
> 17 Jul 2014 23:30:18,159 INFO  [hdfs-s3sink-engagements-call-runner-6] 
> (org.apache.flume.sink.hdfs.AbstractHDFSWriter.reflectGetNumCurrentReplicas:184)
>   - FileSystem's output stream doesn't support getNumCurrentReplicas; 
> --HDFS-826 not available; 
> fsOut=org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream;
>  err=java.lang.NoSuchMethodException: 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.getNumCurrentReplicas()
> 17 Jul 2014 23:30:18,160 WARN  
> [SinkRunner-PollingRunner-DefaultSinkProcessor] 
> (org.apache.flume.sink.hdfs.BucketWriter.getRefIsClosed:210)  - isFileClosed 
> is not available in the version of HDFS being used. Flume will not attempt to 
> close files if the close fails on the first attempt
> java.lang.NoSuchMethodException: 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem.isFileClosed(org.apache.hadoop.fs.Path)
>   at java.lang.Class.getMethod(Class.java:1665)
>   at 
> org.apache.flume.sink.hdfs.BucketWriter.getRefIsClosed(BucketWriter.java:207)
>   at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:295)
>   at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:554)
>   at 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:426)
>   at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>   at java.lang.Thread.run(Thread.java:744)





[jira] [Assigned] (FLUME-2427) java.lang.NoSuchMethodException and warning on HDFS (S3) sink

2017-02-16 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy reassigned FLUME-2427:
-

Assignee: Ping Wang  (was: Mike Percy)

> java.lang.NoSuchMethodException and warning on HDFS (S3) sink 
> --
>
> Key: FLUME-2427
> URL: https://issues.apache.org/jira/browse/FLUME-2427
> Project: Flume
> Issue Type: Question
> Reporter: Bijith Kumar
> Assignee: Ping Wang
> Priority: Minor
> Attachments: FLUME-2427-0.patch
>
>
> The warning and exception below are thrown every time a file is written to S3
> using the HDFS sink. It looks like a jar mismatch to me. I tried the latest
> Hadoop and jets3 jars, but that didn't help.
> 17 Jul 2014 23:30:18,159 INFO  [hdfs-s3sink-engagements-call-runner-6] 
> (org.apache.flume.sink.hdfs.AbstractHDFSWriter.reflectGetNumCurrentReplicas:184)
>   - FileSystem's output stream doesn't support getNumCurrentReplicas; 
> --HDFS-826 not available; 
> fsOut=org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream;
>  err=java.lang.NoSuchMethodException: 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.getNumCurrentReplicas()
> 17 Jul 2014 23:30:18,160 WARN  
> [SinkRunner-PollingRunner-DefaultSinkProcessor] 
> (org.apache.flume.sink.hdfs.BucketWriter.getRefIsClosed:210)  - isFileClosed 
> is not available in the version of HDFS being used. Flume will not attempt to 
> close files if the close fails on the first attempt
> java.lang.NoSuchMethodException: 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem.isFileClosed(org.apache.hadoop.fs.Path)
>   at java.lang.Class.getMethod(Class.java:1665)
>   at 
> org.apache.flume.sink.hdfs.BucketWriter.getRefIsClosed(BucketWriter.java:207)
>   at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:295)
>   at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:554)
>   at 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:426)
>   at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>   at java.lang.Thread.run(Thread.java:744)





[jira] [Commented] (FLUME-2427) java.lang.NoSuchMethodException and warning on HDFS (S3) sink

2017-02-16 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870967#comment-15870967
 ] 

Mike Percy commented on FLUME-2427:
---

+1 on [~wpwang]'s patch, I am also going to lower this to info() and tweak the 
message to say "version of distributed filesystem" instead of "version of HDFS".

> java.lang.NoSuchMethodException and warning on HDFS (S3) sink 
> --
>
> Key: FLUME-2427
> URL: https://issues.apache.org/jira/browse/FLUME-2427
> Project: Flume
> Issue Type: Question
> Reporter: Bijith Kumar
> Priority: Minor
> Attachments: FLUME-2427-0.patch
>
>
> The warning and exception below are thrown every time a file is written to S3
> using the HDFS sink. It looks like a jar mismatch to me. I tried the latest
> Hadoop and jets3 jars, but that didn't help.
> 17 Jul 2014 23:30:18,159 INFO  [hdfs-s3sink-engagements-call-runner-6] 
> (org.apache.flume.sink.hdfs.AbstractHDFSWriter.reflectGetNumCurrentReplicas:184)
>   - FileSystem's output stream doesn't support getNumCurrentReplicas; 
> --HDFS-826 not available; 
> fsOut=org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream;
>  err=java.lang.NoSuchMethodException: 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.getNumCurrentReplicas()
> 17 Jul 2014 23:30:18,160 WARN  
> [SinkRunner-PollingRunner-DefaultSinkProcessor] 
> (org.apache.flume.sink.hdfs.BucketWriter.getRefIsClosed:210)  - isFileClosed 
> is not available in the version of HDFS being used. Flume will not attempt to 
> close files if the close fails on the first attempt
> java.lang.NoSuchMethodException: 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem.isFileClosed(org.apache.hadoop.fs.Path)
>   at java.lang.Class.getMethod(Class.java:1665)
>   at 
> org.apache.flume.sink.hdfs.BucketWriter.getRefIsClosed(BucketWriter.java:207)
>   at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:295)
>   at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:554)
>   at 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:426)
>   at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>   at java.lang.Thread.run(Thread.java:744)
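
For context, the trace above comes from a reflective lookup (Class.getMethod
inside BucketWriter.getRefIsClosed) of a method that this filesystem class does
not provide. A minimal Java sketch of that general pattern, probing for an
optional method and treating its absence as informational rather than as an
error, might look like the following. OptionalMethodProbe and find are
hypothetical names used only for illustration, and String merely stands in for
a filesystem class; this is not Flume's actual code.

```java
import java.lang.reflect.Method;
import java.util.Optional;

/** Hypothetical helper, not Flume's code: probes for an optional method via reflection. */
final class OptionalMethodProbe {

  /** Returns the public method if the class has it, otherwise an empty Optional. */
  static Optional<Method> find(Class<?> type, String name, Class<?>... paramTypes) {
    try {
      return Optional.of(type.getMethod(name, paramTypes));
    } catch (NoSuchMethodException e) {
      // Absence is expected for some implementations, so an info-level note is enough.
      System.out.println("INFO: " + name + " is not available in " + type.getName());
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    // String has isEmpty(), so the probe finds it.
    System.out.println(find(String.class, "isEmpty").isPresent());
    // String has no isFileClosed(), so the probe logs a note and returns empty.
    System.out.println(find(String.class, "isFileClosed").isPresent());
  }
}
```

The caller can then skip the optional behaviour when the Optional is empty,
without printing a stack trace each time.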





[jira] [Commented] (FLUME-3042) use ganglia to monitoring flume KafkaChannel error

2017-02-16 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870956#comment-15870956
 ] 

Mike Percy commented on FLUME-3042:
---

Please report this to the Ganglia project... with an example of the output from 
the Flume metrics. They apparently have a buffer overflow. Then please report 
back to us if you think there is a bug on the Flume side.

> use ganglia to monitoring flume KafkaChannel error
> --
>
> Key: FLUME-3042
> URL: https://issues.apache.org/jira/browse/FLUME-3042
> Project: Flume
> Issue Type: Bug
> Components: Kafka Channel
> Affects Versions: v1.7.0
> Reporter: tiany
>
> When I used Ganglia to monitor KafkaChannel, the following error was reported:
> There was an error collecting ganglia data (127.0.0.1:8652): fsockopen error:
> Connection refused
> I then found that the gmetad service was dead. Debug output follows:
> *** buffer overflow detected ***: gmetad terminated
> === Backtrace: =
> /lib64/libc.so.6(__fortify_fail+0x37)[0x3cb14ff3f7]
> /lib64/libc.so.6[0x3cb14fd2e0]
> /lib64/libc.so.6[0x3cb14fc739]
> /lib64/libc.so.6(_IO_default_xsputn+0xc9)[0x3cb1473899]
> /lib64/libc.so.6(_IO_vfprintf+0x3826)[0x3cb1447516]
> /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x3cb14fc7dd]
> /lib64/libc.so.6(__sprintf_chk+0x7f)[0x3cb14fc71f]
> gmetad[0x40b098]
> gmetad[0x408d19]
> /usr/lib64/libganglia.so.0(hash_foreach+0x59)[0x7fe91d9bf639]
> gmetad[0x408a41]
> This problem occurs only when using KafkaChannel; it does not occur with other
> components (e.g. MemoryChannel, KafkaSink and so on). Why does this happen?
> Please help me, thanks.





[jira] [Resolved] (FLUME-3049) Wrapping the exception into SecurityException in UGIExecutor.execute hides the original one

2017-02-01 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy resolved FLUME-3049.
---
   Resolution: Fixed
Fix Version/s: v1.8.0

Pushed to trunk. Thanks for the patch, Denes!

> Wrapping the exception into SecurityException in UGIExecutor.execute hides 
> the original one
> ---
>
> Key: FLUME-3049
> URL: https://issues.apache.org/jira/browse/FLUME-3049
> Project: Flume
> Issue Type: Bug
> Reporter: Denes Arvay
> Assignee: Denes Arvay
> Fix For: v1.8.0
>
>
> see: 
> https://github.com/apache/flume/blob/trunk/flume-ng-auth/src/main/java/org/apache/flume/auth/UGIExecutor.java#L49
> This has unexpected side effects as the callers try to catch the wrapped 
> exception, for example in {{BucketWriter.append()}}: 
> https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java#L563
> Here IOException is considered as non-transient failure thus the {{close()}} 
> is called, but when the original exception is wrapped into 
> {{SecurityException}} it doesn't trigger the close of the file.
> Similarly in {{HDFSEventSink.process()}} method the `IOException` is handled 
> in a different way than any other exception. It might come from 
> {{BucketWriter.append()}} or {{BucketWriter.flush()}} for example, and both 
> of them invoke the hdfs call via a {{PrivilegedExecutor}} instance which 
> might be the problematic {{UGIExecutor}}.
> The bottom line is that the contract in {{PrivilegedExecutor.execute()}} is 
> that they shouldn't change the exception thrown in the business logic - at 
> least it's not indicated in its signature in any way. The default 
> implementation ({{SimpleAuthenticator}}) behaves according to this.
> I don't know the original intent behind this wrapping, [~jrufus] or 
> [~hshreedharan], do you happen to remember? (You were involved in the 
> original implementation in FLUME-2631)
> Right now I don't see any problem in removing this and letting the original 
> exception propagate as the {{org.apache.flume.auth.SecurityException}} 
> doesn't appear anywhere in the public interface.
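
To make the effect described above concrete, here is a small self-contained
Java sketch. All names in it (WrappingDemo, WrappedException,
executeTransparent, executeWrapping, callerSees) are hypothetical stand-ins
rather than Flume's PrivilegedExecutor or UGIExecutor; it only illustrates how
wrapping an IOException in an unchecked exception makes a caller's IOException
handler unreachable.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Illustrative stand-ins only; not Flume's PrivilegedExecutor or UGIExecutor. */
final class WrappingDemo {

  /** Stand-in for a wrapper such as org.apache.flume.auth.SecurityException. */
  static final class WrappedException extends RuntimeException {
    WrappedException(Throwable cause) {
      super(cause);
    }
  }

  /** Runs the action and lets whatever it throws propagate unchanged. */
  static <T> T executeTransparent(Callable<T> action) throws Exception {
    return action.call();
  }

  /** Runs the action but hides any failure inside WrappedException. */
  static <T> T executeWrapping(Callable<T> action) {
    try {
      return action.call();
    } catch (Exception e) {
      throw new WrappedException(e);
    }
  }

  /** A caller that handles IOException differently from every other exception. */
  static String callerSees(boolean wrapping) {
    Callable<Void> failingWrite = () -> {
      throw new IOException("simulated write failure");
    };
    try {
      if (wrapping) {
        executeWrapping(failingWrite);
      } else {
        executeTransparent(failingWrite);
      }
      return "ok";
    } catch (IOException e) {
      return "IOException handler (the file would be closed)";
    } catch (Exception e) {
      return "generic handler (the close is skipped)";
    }
  }

  public static void main(String[] args) {
    System.out.println(callerSees(false)); // takes the IOException branch
    System.out.println(callerSees(true));  // takes the generic branch
  }
}
```

With the transparent executor the caller's IOException branch runs; with the
wrapping executor the same failure falls through to the generic branch, which
is the side effect the description reports.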





Re: Moving Developer Section of cwiki to the git repository

2016-10-24 Thread Mike Percy
I don't think it would pollute the git log. I also don't think spamming the
dev list with PRs is a big deal... I use GMail filters so I can see the
human traffic... but maybe not everyone does that.

Anyway it's up to you how you want to organize it. :)

Mike

On Mon, Oct 24, 2016 at 4:39 PM, Balazs Donat Bessenyei <bes...@cloudera.com
> wrote:

> Thank you.
>
> Wouldn't committing (and PRing) them one-by-one pollute the git log and
> mailing lists too much?
>
> On Oct 24, 2016 5:26 PM, "Mike Percy" <mpe...@apache.org> wrote:
>
> > Feel free to do it per-file if you want. Looks good to me
> >
> > On Mon, Oct 24, 2016 at 4:14 PM, Balazs Donat Bessenyei <
> > bes...@cloudera.com
> > > wrote:
> >
> > > I have commenced moving the Developer Section to the git repo.
> > >
> > > There is a WIP pull request available at https://github.com/apache/
> > > flume/pull/77
> > >
> > > I'm planning to add more e-mail templates to the HowToRelease.md and
> > > commit all docs in a single commit, but reviews and comments are
> > > welcome in the meanwhile.
> > >
> > >
> > > Thank you,
> > >
> > > Donat
> > >
> > > On Fri, Oct 21, 2016 at 6:07 PM, Mike Percy <mpe...@apache.org> wrote:
> > > > Thanks, Donat. +1 on moving the docs.
> > > >
> > > > Regarding the web site source code, it can reside in svn or git
> however
> > > it
> > > > should continue using the standard ASF site infrastructure [1] for
> > > hosting
> > > > content which consists only of static rendered HTML content. Our
> > options
> > > > for pushing that rendered HTML are either svnpubsub[2] or
> gitpubsub[3]
> > > > which are basically the same thing - commit HTML and it gets pushed
> > live
> > > to
> > > > the web site.
> > > >
> > > > If someone wants to redesign the site or improve the site template, I
> > > would
> > > > certainly welcome that.
> > > >
> > > > As a comparison point with another ASF project I am involved in, on
> > > Apache
> > > > Kudu we use the software used for GitHub pages (Jekyll) to render the
> > > site.
> > > > The web site source code is in the gh-pages branch in Git [4]. The
> site
> > > is
> > > > mostly written in Markdown and we wrote a script [5] that locally
> > renders
> > > > the site to static HTML content and checks in the rendered content to
> > the
> > > > Kudu gitpubsub repository [6].
> > > >
> > > > In Flume, we basically do the same thing but instead of Jekyll we
> > use
> > > > Sphinx and the site source code is primarily ReStructuredText. The
> site
> > > > lives in its own svn repo [7]. We push rendered HTML content to an
> > svnpubsub
> > > > repository for Flume [8] which is controlled by the ASF CMS [9]...
> > IIRC,
> > > > the CMS takes care of invoking the rendering part instead of having a
> > > local
> > > > script to do it. This is kind of documented in the How to Release
> guide
> > > for
> > > > Flume [10].
> > > >
> > > > If someone just wanted to change the look & feel of the web site they
> > > could
> > > > easily do it with a little HTML/CSS knowledge and some reading up on
> > > > Sphinx. The web site template could be changed to something nicer
> (it's
> > > > using some kind of stock template right now). The existing .rst
> content
> > > > would not need to be modified at all, probably.
> > > >
> > > > If someone wanted to switch the whole web site to some other system,
> > like
> > > > Jekyll, obviously it's a bigger change but if they were determined
> > then I
> > > > don't think anyone would try to stop them, assuming it's an
> > improvement!
> > > >
> > > > Hope this helps,
> > > > Mike
> > > >
> > > > [1] https://www.apache.org/dev/project-site.html
> > > > [2] http://svn.apache.org/viewvc/subversion/trunk/tools/
> > > > server-side/svnpubsub/
> > > > [3] https://www.apache.org/dev/gitpubsub.html
> > > > [4] https://github.com/apache/kudu/tree/gh-pages
> > > > [5] https://github.com/apache/kudu/blob/gh-pages/site_tool
> > > > [6] https://git-wip-us.apache.org/repos/asf?p=kudu-site.git;a=
> > > > shortlog;h=refs/heads/a

Re: Moving Developer Section of cwiki to the git repository

2016-10-24 Thread Mike Percy
Feel free to do it per-file if you want. Looks good to me

On Mon, Oct 24, 2016 at 4:14 PM, Balazs Donat Bessenyei <bes...@cloudera.com
> wrote:

> I have commenced moving the Developer Section to the git repo.
>
> There is a WIP pull request available at https://github.com/apache/
> flume/pull/77
>
> I'm planning to add more e-mail templates to the HowToRelease.md and
> commit all docs in a single commit, but reviews and comments are
> welcome in the meanwhile.
>
>
> Thank you,
>
> Donat
>
> On Fri, Oct 21, 2016 at 6:07 PM, Mike Percy <mpe...@apache.org> wrote:
> > Thanks, Donat. +1 on moving the docs.
> >
> > Regarding the web site source code, it can reside in svn or git however
> it
> > should continue using the standard ASF site infrastructure [1] for
> hosting
> > content which consists only of static rendered HTML content. Our options
> > for pushing that rendered HTML are either svnpubsub[2] or gitpubsub[3]
> > which are basically the same thing - commit HTML and it gets pushed live
> to
> > the web site.
> >
> > If someone wants to redesign the site or improve the site template, I
> would
> > certainly welcome that.
> >
> > As a comparison point with another ASF project I am involved in, on
> Apache
> > Kudu we use the software used for GitHub pages (Jekyll) to render the
> site.
> > The web site source code is in the gh-pages branch in Git [4]. The site
> is
> > mostly written in Markdown and we wrote a script [5] that locally renders
> > the site to static HTML content and checks in the rendered content to the
> > Kudu gitpubsub repository [6].
> >
> > In Flume, we basically do the same thing but instead of Jekyll we use
> > Sphinx and the site source code is primarily ReStructuredText. The site
> > lives in its own svn repo [7]. We push rendered HTML content to an svnpubsub
> > repository for Flume [8] which is controlled by the ASF CMS [9]... IIRC,
> > the CMS takes care of invoking the rendering part instead of having a
> local
> > script to do it. This is kind of documented in the How to Release guide
> for
> > Flume [10].
> >
> > If someone just wanted to change the look & feel of the web site they
> could
> > easily do it with a little HTML/CSS knowledge and some reading up on
> > Sphinx. The web site template could be changed to something nicer (it's
> > using some kind of stock template right now). The existing .rst content
> > would not need to be modified at all, probably.
> >
> > If someone wanted to switch the whole web site to some other system, like
> > Jekyll, obviously it's a bigger change but if they were determined then I
> > don't think anyone would try to stop them, assuming it's an improvement!
> >
> > Hope this helps,
> > Mike
> >
> > [1] https://www.apache.org/dev/project-site.html
> > [2] http://svn.apache.org/viewvc/subversion/trunk/tools/
> > server-side/svnpubsub/
> > [3] https://www.apache.org/dev/gitpubsub.html
> > [4] https://github.com/apache/kudu/tree/gh-pages
> > [5] https://github.com/apache/kudu/blob/gh-pages/site_tool
> > [6] https://git-wip-us.apache.org/repos/asf?p=kudu-site.git;a=
> > shortlog;h=refs/heads/asf-site
> > [7] https://svn.apache.org/repos/asf/flume/site/trunk/
> > [8] https://svn.apache.org/repos/infra/websites/production/flume/
> > [9] https://www.apache.org/dev/cms.html
> > [10]
> > https://cwiki.apache.org/confluence/display/FLUME/How+
> to+Release#HowtoRelease-Updatethewebsite
> >
> > On Thu, Oct 20, 2016 at 2:59 PM, Balazs Donat Bessenyei <
> bes...@cloudera.com
> >> wrote:
> >
> >> As nobody has objected in ~48 hours, I'll open a pull request about this
> >> soon.
> >>
> >> Regarding the whole site thing: AFAIK, the site contents have to
> >> reside in SVN, because that's how ASF infrastructure works now.
> >>
> >> First, I'll work on moving the Developer Section to
> >> https://github.com/apache/flume , then we'll see what else we can do.
> >>
> >>
> >> Thank you,
> >>
> >> Donat
> >>
> >> On Thu, Oct 20, 2016 at 12:41 PM, UMESH CHAUDHARY <umesh9...@gmail.com>
> >> wrote:
> >> > +1 for moving wiki and web contents into GitHub. It would be easier to
> >> > manage. Also, +1 for improving our website.
> >> >
> >> > On Thu, 20 Oct 2016 at 15:42 Lior Zeno <liorz...@gmail.com> wrote:
> >> >
> >> > +1 for moving the whole site as well. I wish we could improve our
> >> f

Re: jira reference is missing from git commit messages

2016-10-23 Thread Mike Percy
Hi Attila,

On Fri, Oct 21, 2016 at 7:51 PM, Attila Simon  wrote:

> I have no strong opinion on whether we should have a jira or not (thus
> the proposal of FLUME-PR70). I just found this habit very useful.
>

It seems kind of obtrusive to me to require committers to use this kind of
pattern for a PR, to be honest, when the "Closes #70" thing is already
there.

Something that Gerrit does is it automatically adds something equivalent to:

  Reviewed-on: https://github.com/apache/flume/pull/70

to the bottom of the commit message. Maybe it would be useful to have
people include this when committing PRs? However, it's particular enough
that we should probably write a shell script to automate it so committers
don't have to remember (or adapt one of the scripts used by the Spark
project, maybe).

Would that help with the problem you are facing?
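
As a rough sketch of what such automation could do: the snippet below appends a
"Reviewed-on" trailer to a commit message and is safe to run twice. It is
written in Java rather than shell purely for illustration, and CommitTrailer
and withReviewedOn are hypothetical names, not an existing Flume or Spark tool.

```java
/** Hypothetical helper: appends a Gerrit-style "Reviewed-on" trailer to a commit message. */
final class CommitTrailer {

  private static final String PR_BASE = "https://github.com/apache/flume/pull/";

  /** Returns the message ending with exactly one Reviewed-on trailer for the given PR. */
  static String withReviewedOn(String message, int prNumber) {
    String trailer = "Reviewed-on: " + PR_BASE + prNumber;
    // Drop trailing whitespace so the result always ends with a single newline.
    String body = message.replaceAll("\\s+$", "");
    // If the trailer is already the last line, only normalize the trailing newline.
    if (body.endsWith("\n" + trailer) || body.equals(trailer)) {
      return body + "\n";
    }
    return body + "\n\n" + trailer + "\n";
  }

  public static void main(String[] args) {
    System.out.print(withReviewedOn("Fix typo in user guide\n\nCloses #70\n", 70));
  }
}
```

A real script would also need to find the PR number for the commit being
merged, which this sketch takes as a parameter.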


> I have tooling which depends on the commit titles are unique with high
> probability. Most likely these tools can be upgraded with some extra
> effort.
>

I think we should try to make it as easy as possible for downstream
distributors to consume Flume. Still, I wonder if that tool you're using
just sucks. Let me know how I can help you deal with the downstream stuff
as I wonder if there aren't better ways of solving this problem.

> Maybe having jira for each change is an overkill for small changes.
> What do you think what can be considered as a small change? eg the
> ones for which the "how to contribute" guide doesn't require review
> board?
>

In the past we have used JIRA to track patch submission, comments, and code
reviews. GitHub Pull Requests encompass all of those things, so I don't see
any reason to also use JIRA when submitting a PR, except when the PR fixes
an issue or implements a feature that has a JIRA filed against it.

I just don't see value to the project in requiring someone submitting a PR
to additionally file a JIRA. I think JIRA is mostly useful as a way to
track outstanding bugs and future work.

I know that different people have different views on this issue and I don't
want to impose my preferences on everyone working on Flume. I'd like to get
your thoughts on the above and I'd also like to hear from others on the
topic. I hope that we can standardize on a process that works well for
everyone.

Mike


Re: Enabling Travis-CI on Flume

2016-10-21 Thread Mike Percy
Hi Lior!

No, my message was not directed at you, or any person in particular. I
intended this message for those paying attention to this topic to try and
set expectations for how decision making for things like this usually (and
hopefully) works in Apache: If you are willing to do the work to get
something done, then it will probably get done the way you want!

(Assuming the end result is something that others in the community want --
in this case, it's very basic pre-commit checks. I think we have consensus
that pre-commit checks are something that would be a net benefit to
everyone.)

Sorry for any confusion! And don't let me stop people from voicing their
views and concerns.

Best,
Mike

On Fri, Oct 21, 2016 at 4:44 PM, Lior Zeno <liorz...@gmail.com> wrote:

> Mike, I was not holding Donat back. I was just suggesting ways to configure
> Jenkins, per Donat's request. I'm sorry if my former post delivered the
> wrong message.
>
> On Fri, Oct 21, 2016 at 6:29 PM, Mike Percy <mpe...@apache.org> wrote:
>
> > Personally I prefer Jenkins over TravisCI for various reasons however if
> > Donat is willing to do the work of adding pre-commit checks on PRs via
> > Travis then I say let him do it, in the Apache spirit of "let they that
> do
> > the work make the decisions".
> >
> > If someone actually spends the time to set up Jenkins and configure it to
> > do the same thing, then great, let's switch when it's ready.
> >
> > Note that only ASF committers have access to Jenkins so non-committers
> will
> > need to work with a committer to get it done if they want to help.
> >
> > Mike
> >
> > On Fri, Oct 21, 2016 at 3:46 PM, Lior Zeno <liorz...@gmail.com> wrote:
> >
> > > There are many ways to do it, for example:
> > > https://www.theguild.nl/building-github-pull-requests-using-
> > > jenkins-pipelines/
> > > or https://www.theguild.nl/building-github-pull-requests-with-jenkins/
> > for
> > > earlier versions of Jenkins.
> > > I do not really care if it would be Jenkins or Travis, but I do think
> > that
> > > we can get Jenkins configured faster since we already have it. I can
> help
> > > with the configuration.
> > >
> > > On Fri, Oct 21, 2016 at 5:17 PM, Balazs Donat Bessenyei <
> > > bes...@cloudera.com
> > > > wrote:
> > >
> > > > As I haven't received any objections to enabling Travis, I'm going to
> > > > ask INFRA to enable it for Flume soon.
> > > >
> > > > This change would help submitting and reviewing pull requests.
> > > >
> > > > If someone figures out how we could use Jenkins for this purpose, we
> > > > can always disable Travis.
> > > >
> > > > PS. there are more projects using Travis:
> > > > https://issues.apache.org/jira/browse/INFRA-12757?jql=
> > > > project%20%3D%20INFRA%20AND%20text%20~%20travis%20ORDER%
> > > > 20BY%20updated%20DESC
> > > >
> > > > On Fri, Oct 14, 2016 at 5:41 PM, Attila Simon <s...@cloudera.com>
> > wrote:
> > > > > Denes I'm happy to help you in this endeavor of setting up jenkins
> > job
> > > > for
> > > > > verifying pull requests.
> > > > >
> > > > >
> > > > > *Attila Simon*
> > > > > Software Engineer
> > > > > Email:   s...@cloudera.com
> > > > >
> > > > > [image: Cloudera Inc.]
> > > > >
> > > > > On Fri, Oct 14, 2016 at 2:47 PM, Denes Arvay <de...@cloudera.com>
> > > wrote:
> > > > >
> > > > >> I'd also vote for Jenkins with github PRs.
> > > > >> I just checked Mesos and the PRs are checked by Travis, or at
> least
> > > they
> > > > >> experienced with it, there's a short discussion regarding to
> Travis
> > at
> > > > >> https://github.com/apache/mesos/pull/165
> > > > >>
> > > > >> As for the jenkins pull request job I'd be happy to set it up or
> > help
> > > > >> setting it up.
> > > > >>
> > > > >> Denes
> > > > >>
> > > > >> On Fri, Oct 14, 2016 at 2:15 PM Lior Zeno <liorz...@gmail.com>
> > wrote:
> > > > >>
> > > > >> Are we switching to PRs from patches + RB? In Apache Mesos, they
> > have
> > > a
> > > > >>
> > > > >> review bot that can leave 

Re: Moving Developer Section of cwiki to the git repository

2016-10-21 Thread Mike Percy
Thanks, Donat. +1 on moving the docs.

Regarding the web site source code, it can reside in svn or git however it
should continue using the standard ASF site infrastructure [1] for hosting
content which consists only of static rendered HTML content. Our options
for pushing that rendered HTML are either svnpubsub[2] or gitpubsub[3]
which are basically the same thing - commit HTML and it gets pushed live to
the web site.

If someone wants to redesign the site or improve the site template, I would
certainly welcome that.

As a comparison point with another ASF project I am involved in, on Apache
Kudu we use the software used for GitHub pages (Jekyll) to render the site.
The web site source code is in the gh-pages branch in Git [4]. The site is
mostly written in Markdown and we wrote a script [5] that locally renders
the site to static HTML content and checks in the rendered content to the
Kudu gitpubsub repository [6].

In Flume, we basically do the same thing but instead of Jekyll we use
Sphinx and the site source code is primarily ReStructuredText. The site
lives in its own svn repo [7]. We push rendered HTML content to an svnpubsub
repository for Flume [8] which is controlled by the ASF CMS [9]... IIRC,
the CMS takes care of invoking the rendering part instead of having a local
script to do it. This is kind of documented in the How to Release guide for
Flume [10].

If someone just wanted to change the look & feel of the web site they could
easily do it with a little HTML/CSS knowledge and some reading up on
Sphinx. The web site template could be changed to something nicer (it's
using some kind of stock template right now). The existing .rst content
would not need to be modified at all, probably.

If someone wanted to switch the whole web site to some other system, like
Jekyll, obviously it's a bigger change but if they were determined then I
don't think anyone would try to stop them, assuming it's an improvement!

Hope this helps,
Mike

[1] https://www.apache.org/dev/project-site.html
[2] http://svn.apache.org/viewvc/subversion/trunk/tools/
server-side/svnpubsub/
[3] https://www.apache.org/dev/gitpubsub.html
[4] https://github.com/apache/kudu/tree/gh-pages
[5] https://github.com/apache/kudu/blob/gh-pages/site_tool
[6] https://git-wip-us.apache.org/repos/asf?p=kudu-site.git;a=
shortlog;h=refs/heads/asf-site
[7] https://svn.apache.org/repos/asf/flume/site/trunk/
[8] https://svn.apache.org/repos/infra/websites/production/flume/
[9] https://www.apache.org/dev/cms.html
[10] https://cwiki.apache.org/confluence/display/FLUME/How+to+Release#HowtoRelease-Updatethewebsite

On Thu, Oct 20, 2016 at 2:59 PM, Balazs Donat Bessenyei  wrote:

> As nobody has objected in ~48 hours, I'll open a pull request about this
> soon.
>
> Regarding the whole site thing: AFAIK, the site contents have to
> reside in SVN, because that's how ASF infrastructure works now.
>
> First, I'll work on moving the Developer Section to
> https://github.com/apache/flume , then we'll see what else we can do.
>
>
> Thank you,
>
> Donat
>
> On Thu, Oct 20, 2016 at 12:41 PM, UMESH CHAUDHARY 
> wrote:
> > +1 for moving wiki and web contents into GitHub. It would be easier to
> > manage. Also, +1 for improving our website.
> >
> > On Thu, 20 Oct 2016 at 15:42 Lior Zeno  wrote:
> >
> > +1 for moving the whole site as well. I wish we could improve our
> frontend
> > to become a bit more appealing.
> >
> > On Thu, Oct 20, 2016 at 11:53 AM, Attila Simon 
> wrote:
> >
> >> +1
> >> Essentially moving whole site (all web content) to scm would help
> >> contribution.
> >>
> >> Cheers,
> >> Attila
> >>
> >> On Tuesday, 18 October 2016, Balazs Donat Bessenyei <
> bes...@cloudera.com>
> >> wrote:
> >>
> >> > Hi All,
> >> >
> >> > As it's kind of difficult to get permissions in the wiki to edit pages
> >> > like https://cwiki.apache.org/confluence/display/FLUME/How+to+Contribute
> >> > , I suggest moving contents to the git repository into files like
> >> > CONTRIBUTING.md, etc.
> >> >
> >> > I'd be happy to create the files and move the current texts.
> >> >
> >> > Please, let me know your thoughts.
> >> >
> >> >
> >> > Thank you,
> >> >
> >> > Donat
> >> >
> >>
> >>
> >> --
> >>
> >> *Attila Simon*
> >> Software Engineer
> >> Email:   s...@cloudera.com
> >>
> >> [image: Cloudera Inc.]
> >>
>


Re: Enabling Travis-CI on Flume

2016-10-21 Thread Mike Percy
Personally, I prefer Jenkins over Travis CI for various reasons. However, if
Donat is willing to do the work of adding pre-commit checks on PRs via
Travis, then I say let him do it, in the Apache spirit of "let they that do
the work make the decisions".

If someone actually spends the time to set up Jenkins and configure it to
do the same thing, then great, let's switch when it's ready.

Note that only ASF committers have access to Jenkins so non-committers will
need to work with a committer to get it done if they want to help.

Mike

On Fri, Oct 21, 2016 at 3:46 PM, Lior Zeno  wrote:

> There are many ways to do it, for example:
> https://www.theguild.nl/building-github-pull-requests-using-jenkins-pipelines/
> or https://www.theguild.nl/building-github-pull-requests-with-jenkins/ for
> earlier versions of Jenkins.
> I do not really care if it would be Jenkins or Travis, but I do think that
> we can get Jenkins configured faster since we already have it. I can help
> with the configuration.
>
> On Fri, Oct 21, 2016 at 5:17 PM, Balazs Donat Bessenyei <
> bes...@cloudera.com
> > wrote:
>
> > As I haven't received any objections to enabling Travis, I'm going to
> > ask INFRA to enable it for Flume soon.
> >
> > This change would help submitting and reviewing pull requests.
> >
> > If someone figures out how we could use Jenkins for this purpose, we
> > can always disable Travis.
> >
> > PS. there are more projects using Travis:
> > https://issues.apache.org/jira/browse/INFRA-12757?jql=project%20%3D%20INFRA%20AND%20text%20~%20travis%20ORDER%20BY%20updated%20DESC
> >
> > On Fri, Oct 14, 2016 at 5:41 PM, Attila Simon  wrote:
> > > Denes I'm happy to help you in this endeavor of setting up jenkins job
> > for
> > > verifying pull requests.
> > >
> > >
> > > *Attila Simon*
> > > Software Engineer
> > > Email:   s...@cloudera.com
> > >
> > > [image: Cloudera Inc.]
> > >
> > > On Fri, Oct 14, 2016 at 2:47 PM, Denes Arvay 
> wrote:
> > >
> > >> I'd also vote for Jenkins with github PRs.
> > >> I just checked Mesos and the PRs are checked by Travis, or at least
> they
> > >> experienced with it, there's a short discussion regarding to Travis at
> > >> https://github.com/apache/mesos/pull/165
> > >>
> > >> As for the jenkins pull request job I'd be happy to set it up or help
> > >> setting it up.
> > >>
> > >> Denes
> > >>
> > >> On Fri, Oct 14, 2016 at 2:15 PM Lior Zeno  wrote:
> > >>
> > >> Are we switching to PRs from patches + RB? In Apache Mesos, they have
> a
> > >>
> > >> review bot that can leave a comment on the patch, we could try and
> port
> > it
> > >>
> > >> to Flume. I think they use Jenkins too.
> > >>
> > >>
> > >>
> > >> On Fri, Oct 14, 2016 at 3:11 PM, Balazs Donat Bessenyei <
> > >> bes...@cloudera.com
> > >>
> > >> > wrote:
> > >>
> > >>
> > >>
> > >> > If the same function can be achieved with Jenkins and it's easy
> > >>
> > >> > (+quick) to set up, I'm totally happy with that.
> > >>
> > >> >
> > >>
> > >> > What do we have to do to enable Jenkins builds on PR-s?
> > >>
> > >> >
> > >>
> > >> > On Fri, Oct 14, 2016 at 2:05 PM, Lior Zeno 
> > wrote:
> > >>
> > >> > > There are ways to do the same with Jenkins, for instance, see this
> > SO
> > >>
> > >> > > thread
> > >>
> > >> > > http://stackoverflow.com/questions/37661602/how-to-set-
> > >>
> > >> > up-a-github-pull-request-build-in-a-jenkinsfile
> > >>
> > >> > >
> > >>
> > >> > > On Fri, Oct 14, 2016 at 11:09 AM, Balazs Donat Bessenyei <
> > >>
> > >> > > bes...@cloudera.com> wrote:
> > >>
> > >> > >
> > >>
> > >> > >> My primary reason for Travis (vs. Jenkins) was that I have
> > experience
> > >>
> > >> > with
> > >>
> > >> > >> it.
> > >>
> > >> > >>
> > >>
> > >> > >> And it leaves these happy little checkmarks:
> > >>
> > >> > >> https://github.com/sebastianbergmann/phpunit/pull/1051/commits
> on
> > the
> > >>
> > >> > >> commits and messages as seen on
> > >>
> > >> > >> https://github.com/apache/hive/pull/107 .
> > >>
> > >> > >>
> > >>
> > >> > >> Jenkins is probably configurable to achieve similar function.
> > However,
> > >>
> > >> > >> I have no idea how to do such. (And could not find an example
> when
> > I
> > >>
> > >> > >> did a quick search.)
> > >>
> > >> > >>
> > >>
> > >> > >> Are there any disadvantages of enabling Travis on Flume?
> > >>
> > >> > >>
> > >>
> > >> > >>
> > >>
> > >> > >> Thank you,
> > >>
> > >> > >>
> > >>
> > >> > >> Donat
> > >>
> > >> > >>
> > >>
> > >> > >> On Thu, Oct 13, 2016 at 6:06 PM, Lior Zeno 
> > >> wrote:
> > >>
> > >> > >> > Jenkins can do PRs as well. If we can upgrade Jenkins to 2.0,
> we
> > >> will
> > >>
> > >> > be
> > >>
> > >> > >> > able to define the build step via Jenkinsfile which becomes
> very
> > >>
> > >> > similar
> > >>
> > >> > >> to
> > >>
> > >> > >> > Travis.
> > >>
> > >> > >> > Is there any reason to prefer Travis over Jenkins in our 

Re: jira reference is missing from git commit messages

2016-10-21 Thread Mike Percy
Hi Attila,
Thanks for raising this concern of yours. Please see inline.

On Fri, Oct 21, 2016 at 12:17 PM, Attila Simon  wrote:

> The how to contribute page asks to have a jira first for each change. I
> guess with allowing pull requests we have to update the how to contribute
> page as well (which only describes attaching patch and using review board).
>

Yes, it appeared that we had consensus for allowing PRs on a couple of
separate dev@ threads a while back [1], [2].

The Flume contributor docs are currently a bit stale. I have recently
updated the How to Release cwiki page but if you want to volunteer to
update some of the other docs that would be very welcome! Please let me
know if you want to help and I can give you cwiki edit access if you send
me your account id. Side note: Donat has recently proposed moving those
docs from cwiki to the Git repo which I think would be a big improvement.

> I would like to ask committers to reestablish the habit of having a jira
> for each commit and start the message with that jira.
>

Forcing people to file a JIRA when they are already doing a PR feels like
pointless extra paperwork to me. There is certainly a place for JIRA in a
software project, but I think that is to track unfixed bugs, ongoing tasks
/ projects, etc.


> If we would like to relax the have jira for each change (for pull requests)
> then I would suggest putting the request id as the first thing in the
> commit message.
>

Why?

If you click on this:
https://github.com/apache/flume/commit/87d4c2c13862144eb578b211bcf800b2206834ff

You will see that the text "Closes #70" creates a clickable hyperlink to
the PR. Is this not sufficient for tracking purposes?

Mike

[1] https://s.apache.org/k31f
[2] https://s.apache.org/Skm2


Re: [ANNOUNCE] Apache Flume 1.7.0 released

2016-10-19 Thread Mike Percy
Thank you for all your hard work RMing this release Donat and getting it to
the finish line. It was my pleasure to help out!

Best,
Mike

On Tue, Oct 18, 2016 at 1:32 PM, Balazs Donat Bessenyei <bes...@cloudera.com
> wrote:

> And thank you, Mike Percy for the mentoring and tremendous amounts of
> assistance with the release!
>
> On Tue, Oct 18, 2016 at 12:46 PM, Balazs Donat Bessenyei
> <bes...@cloudera.com> wrote:
> > Thank you all very much who participated and helped with the release!
> >
> >
> > Donat
> >
> >
> > On Tue, Oct 18, 2016 at 12:37 PM, Mike Percy <mpe...@apache.org> wrote:
> >> Woot! Congrats everyone!
> >>
> >> Thanks Donat for working so hard to get this version of Flume out the
> door!
> >>
> >> Best,
> >> Mike
> >>
> >>
> >> On Tue, Oct 18, 2016 at 10:09 AM, Bessenyei Balázs Donát <
> bes...@apache.org>
> >> wrote:
> >>>
> >>> The Apache Flume team is pleased to announce the release of Flume
> >>> version 1.7.0.
> >>>
> >>> Flume is a distributed, reliable, and available service for efficiently
> >>> collecting, aggregating, and moving large amounts of log data.
> >>>
> >>> This release can be downloaded from the Flume download page at:
> >>> http://flume.apache.org/download.html
> >>>
> >>> The change log and documentation are available on the 1.7.0 release
> page:
> >>> http://flume.apache.org/releases/1.7.0.html
> >>>
> >>> Your help and feedback is more than welcome. For more information on
> how
> >>> to report problems and to get involved, visit the project website at
> >>> http://flume.apache.org/
> >>>
> >>> The Apache Flume Team
> >>
> >>
>


Re: [ANNOUNCE] Apache Flume 1.7.0 released

2016-10-18 Thread Mike Percy
Woot! Congrats everyone!

Thanks Donat for working so hard to get this version of Flume out the door!

Best,
Mike

On Tue, Oct 18, 2016 at 10:09 AM, Bessenyei Balázs Donát 
wrote:

> The Apache Flume team is pleased to announce the release of Flume
> version 1.7.0.
>
> Flume is a distributed, reliable, and available service for efficiently
> collecting, aggregating, and moving large amounts of log data.
>
> This release can be downloaded from the Flume download page at:
> http://flume.apache.org/download.html
>
> The change log and documentation are available on the 1.7.0 release page:
> http://flume.apache.org/releases/1.7.0.html
>
> Your help and feedback is more than welcome. For more information on how
> to report problems and to get involved, visit the project website at
> http://flume.apache.org/
>
> The Apache Flume Team
>


Re: [RESULT] Flume 1.7.0 release vote

2016-10-17 Thread Mike Percy
I just pushed the RC2 artifacts to dist and the staged Maven repo to
central.

It's best to wait 24 hours for the bits to propagate to the dist mirrors
before announcing the release, so users don't get confused when the
download links don't work for their local mirror yet.

Mike

On Mon, Oct 17, 2016 at 2:20 PM, Balazs Donat Bessenyei <bes...@cloudera.com
> wrote:

> Thank you for the help, Mike!
>
> On Mon, Oct 17, 2016 at 1:53 PM, Mike Percy <mpe...@apache.org> wrote:
> > Thanks for running the vote, Donat!
> >
> > I will help to deploy the staged artifacts.
> >
> > Mike
> >
> > On Mon, Oct 17, 2016 at 10:57 AM, Balazs Donat Bessenyei <
> > bes...@cloudera.com> wrote:
> >
> >> The release vote for Apache Flume 1.7.0 has been completed in this
> >> thread: https://lists.apache.org/thread.html/3cf114125dbea6e662b69d1312c34184690af327900e8dcccea5edde@%3Cdev.flume.apache.org%3E
> >>
> >> The vote has received 3 binding +1 votes from the following PMC members:
> >> Mike Percy
> >> Hari Shreedharan
> >> Brock Noland
> >>
> >> Four non-binding +1 votes were received from:
> >> Denes Arvay
> >> Bessenyei Balázs Donát
> >> Lior Zeno
> >> Attila Simon
> >>
> >> No +0 or -1 votes were received.
> >>
> >> Since three +1 votes were received from the PMC with no -1 votes, the
> >> vote passes and Flume 1.7.0 RC2 will be promoted to the Flume 1.7.0
> >> release.
> >>
> >> Thanks to all who voted!
> >>
> >>
> >> Thank you,
> >>
> >> Donat
> >>
>


Re: [RESULT] Flume 1.7.0 release vote

2016-10-17 Thread Mike Percy
Thanks for running the vote, Donat!

I will help to deploy the staged artifacts.

Mike

On Mon, Oct 17, 2016 at 10:57 AM, Balazs Donat Bessenyei <
bes...@cloudera.com> wrote:

> The release vote for Apache Flume 1.7.0 has been completed in this
> thread: https://lists.apache.org/thread.html/3cf114125dbea6e662b69d1312c34184690af327900e8dcccea5edde@%3Cdev.flume.apache.org%3E
>
> The vote has received 3 binding +1 votes from the following PMC members:
> Mike Percy
> Hari Shreedharan
> Brock Noland
>
> Four non-binding +1 votes were received from:
> Denes Arvay
> Bessenyei Balázs Donát
> Lior Zeno
> Attila Simon
>
> No +0 or -1 votes were received.
>
> Since three +1 votes were received from the PMC with no -1 votes, the
> vote passes and Flume 1.7.0 RC2 will be promoted to the Flume 1.7.0
> release.
>
> Thanks to all who voted!
>
>
> Thank you,
>
> Donat
>


Re: Improving Flume distribution packaging

2016-10-14 Thread Mike Percy
They might be interested in discussing Docker images on dev@bigtop.a.o ...

Mike

On Friday, October 14, 2016, Lior Zeno <liorz...@gmail.com> wrote:

> I was not aware of BigTop, I do not think we should duplicate their work
> here.
> Maybe they will publish docker images as well in the future.
>
> On Fri, Oct 14, 2016 at 11:30 AM, Mike Percy <mpe...@apache.org> wrote:
>
> > Currently rpms are handled by downstream projects like Apache BigTop as
> > well as commercial distributions. So in my view it's a duplication of
> > efforts there, particularly since BigTop and distributions based on it
> also
> > harmonize dependencies to a certain extent and we mark many of our
> > dependencies as "provided".
> >
> > So it's possible but not sure it's worth the effort since Flume depends
> on
> > so many other projects like Hadoop, HBase, etc
> >
> > Mike
> >
> > On Fri, Oct 14, 2016 at 9:36 AM, Lior Zeno <liorz...@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > Currently, we distribute Flume via a tarball. I think we should improve
> > and
> > > also publish deb and rpm packages that allow running Flume as a
> service.
> > >
> > > More over, container technology, such as Docker, has become
> increasingly
> > > popular in last few years. Publishing an official Docker image will
> allow
> > > easier deployment of Flume in the cloud.
> > >
> > > How do you guys feel about this?
> > >
> > > Thanks
> > >
> >
>


Re: Improving Flume distribution packaging

2016-10-14 Thread Mike Percy
Currently rpms are handled by downstream projects like Apache BigTop as
well as commercial distributions. So in my view it's a duplication of
efforts there, particularly since BigTop and distributions based on it also
harmonize dependencies to a certain extent and we mark many of our
dependencies as "provided".

So it's possible, but I'm not sure it's worth the effort since Flume depends
on so many other projects like Hadoop, HBase, etc.

Mike

On Fri, Oct 14, 2016 at 9:36 AM, Lior Zeno  wrote:

> Hi All,
>
> Currently, we distribute Flume via a tarball. I think we should improve and
> also publish deb and rpm packages that allow running Flume as a service.
>
> More over, container technology, such as Docker, has become increasingly
> popular in last few years. Publishing an official Docker image will allow
> easier deployment of Flume in the cloud.
>
> How do you guys feel about this?
>
> Thanks
>


Re: [VOTE] Release Apache Flume version 1.7.0 RC2

2016-10-13 Thread Mike Percy
+1 (binding)

There are some flaky tests which are listed below but I don't think they
are release blockers.

I performed the following checks:

Binary convenience artifact:
* Signature and checksums match
* LICENSE, NOTICE, and README.md files in the binary convenience artifact
look accurate and complete relative to the jars in lib/
* Ran a very quick test with the binary artifact and it worked:
  ./bin/flume-ng agent -c conf -f conf/flume-conf.properties.template -n agent -Dflume.root.logger=DEBUG,console
* Checked that the documentation in docs/ renders: Flume User Guide and
Flume Dev Guide are OK. Also spot-checked that the new Kafka security
documentation was included in the User Guide

Source artifact:
* Signature and checksums match
* Built Flume from the source artifact using Oracle JDK 1.7.0_80 on Ubuntu
Linux 16.04, sanity tested the resulting binary using the above method and
it worked
* RAT checks passed
* Built a new source artifact out of the official source artifact and
compiled it
* I ran the unit tests. Most passed but the below two failed. These are
flaky tests (we have a bunch of them in Flume) so I think it's fine not to
block the release on them.
  * TestExecSource.testMonitoredCounterGroup - looks like a racy test
  * TestSpillableMemoryChannel - didn't investigate
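
The "checksums match" step above can be sketched roughly as follows. This is a minimal illustration in Python, not the procedure actually used for the vote: the function names are hypothetical, and it assumes the published checksum file starts with the hex digest (some tools append the file name after it). Signature verification is a separate step (for example with gpg --verify) and is not covered here.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact, checksum_file, algorithm="sha1"):
    """Compare an artifact against a published checksum file.

    Assumption: the first whitespace-separated token in the checksum
    file is the hex digest; anything after it (such as a file name)
    is ignored.
    """
    published = Path(checksum_file).read_text().split()[0].lower()
    return file_digest(artifact, algorithm) == published
```

With a release tarball and its *.sha1 file downloaded side by side, calling checksum_matches(tarball_path, sha1_path) would return True only if the digests agree; passing algorithm="md5" covers the *.md5 file in the same way.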

RC2 looks good to me.

Thanks for running this release, Donat!

Mike

On Wed, Oct 12, 2016 at 9:29 PM, Balazs Donat Bessenyei  wrote:

> Hi All,
>
> This is the tenth release for Apache Flume as a top-level project,
> version 1.7.0. We are voting on release candidate RC2.
>
> It fixes the following issues:
>   https://raw.githubusercontent.com/apache/flume/flume-1.7/CHANGELOG
>
> *** Please cast your vote within the next 72 hours ***
>
> The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
> for the source and binary artifacts can be found here:
>   http://people.apache.org/~bessbd/apache-flume-1.7.0-rc2/
>
> Maven staging repo:
>   https://repository.apache.org/content/repositories/orgapacheflume-1020/
>
> The tag to be voted on:
>   https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=511d868
>
> Flume's KEYS file containing PGP keys we use to sign the release:
>   https://www.apache.org/dist/flume/KEYS
>
>
> Thank you,
>
> Donat
>


Re: [VOTE] Release Apache Flume version 1.7.0 RC1

2016-10-12 Thread Mike Percy
Thanks Attila. I just submitted PR #70 to fix the issue with the test libs
in the binary artifact classpath.

Mike

On Wed, Oct 12, 2016 at 6:39 PM, Attila Simon <s...@cloudera.com> wrote:

> Thanks Mike,
>
> I just created a pull request for this change:
> https://github.com/apache/flume/pull/68
> Regarding to the jackson-core|mapper-asl: it only has versions up to
> 1.9.13 as of now. Thus specifying the 1. in the license is confusing
> for me. "-asl" is already in the name and that is the important
> information.
>
> Cheers,
> Attila
>
> On Wed, Oct 12, 2016 at 6:13 PM, Mike Percy <mpe...@apache.org> wrote:
> > On Wed, Oct 12, 2016 at 5:07 PM, Attila Simon <s...@cloudera.com> wrote:
> >
> >> Additional libraries shipped but not mentioned in the licence file:
> >> hamcrest-core (BSD License), junit (BSD License), xz (
> >> http://git.tukaani.org/?p=xz.git;a=blob;f=COPYING),
> >>
> >
> > Regarding xz it looks like while the main xz program has some weird mixed
> > license, the xz-java library (which is what we are using) is in the
> public
> > domain: http://git.tukaani.org/?p=xz-java.git;a=blob_plain;f=COPYING;hb=HEAD
> >
> > Consulting the official ASF legal guidelines found at
> > https://www.apache.org/legal/resolved we can see that public domain
> works
> > are allowed to be included in ASF software releases, but attribution may
> be
> > needed. Simply adding the text of the xz-java COPYING file to the Flume
> > LICENSE file verbatim includes copyright and attribution so that should
> be
> > sufficient. Looking at other ASF projects that include public-domain
> works,
> > that seems to be the approach used and they don't additionally include
> such
> > attribution in their NOTICE files.
> >
> > Libraries which require little correction in the name in the licence
> file:
> >> jackson-core-asl-1..jar
> >> jackson-mapper-asl-1..jar
> >>
> >
> > This was done on purpose to make it obvious that we are using the
> > asl-licensed versions of jackson-1. jackson-1 is dual-licensed which the
> > ASF doesn't like, whereas jackson-2 is ASL 2.0.
> >
> > Suggested actions:
> >>  1) hamcrest-core, junit these are testing libs, I think shouldn't
> >> be shipped (only required for tests, I guess they were added by the
> >> new .*shared.* module)
> >>
> >
> > +1, the artifacts from kafka-shared-test should not be shipped in the
> > binary artifact
> >
> >
> >>  2) kafka-clients, lz4 should be added to our LICENSE file
> >>
> >
> > +1
> >
> >
> >>  3) jackson.* fix the typo in the LICENSE file
> >>
> >
> > I would rather we not change this per my comment above, unless you think
> > it's very confusing
> >
> >  4) zookeeper since we depend on zkclient instead we should remove
> >> this from the LICENSE file.
> >>
> >
> > +1 to remove this entry from the LICENSE file now that it's no longer a
> > transitive dependency.
> >
> > Mike
>


Re: [VOTE] Release Apache Flume version 1.7.0 RC1

2016-10-12 Thread Mike Percy
On Wed, Oct 12, 2016 at 5:07 PM, Attila Simon  wrote:

> Additional libraries shipped but not mentioned in the licence file:
> hamcrest-core (BSD License), junit (BSD License), xz (
> http://git.tukaani.org/?p=xz.git;a=blob;f=COPYING),
>

Regarding xz it looks like while the main xz program has some weird mixed
license, the xz-java library (which is what we are using) is in the public
domain: http://git.tukaani.org/?p=xz-java.git;a=blob_plain;f=COPYING;hb=HEAD

Consulting the official ASF legal guidelines found at
https://www.apache.org/legal/resolved we can see that public domain works
are allowed to be included in ASF software releases, but attribution may be
needed. Simply adding the text of the xz-java COPYING file to the Flume
LICENSE file verbatim includes copyright and attribution so that should be
sufficient. Looking at other ASF projects that include public-domain works,
that seems to be the approach used and they don't additionally include such
attribution in their NOTICE files.

> Libraries which require little correction in the name in the licence file:
> jackson-core-asl-1..jar
> jackson-mapper-asl-1..jar
>

This was done on purpose to make it obvious that we are using the
asl-licensed versions of jackson-1. jackson-1 is dual-licensed which the
ASF doesn't like, whereas jackson-2 is ASL 2.0.

Suggested actions:
>  1) hamcrest-core, junit these are testing libs, I think shouldn't
> be shipped (only required for tests, I guess they were added by the
> new .*shared.* module)
>

+1, the artifacts from kafka-shared-test should not be shipped in the
binary artifact


>  2) kafka-clients, lz4 should be added to our LICENSE file
>

+1


>  3) jackson.* fix the typo in the LICENSE file
>

I would rather we not change this per my comment above, unless you think
it's very confusing

>  4) zookeeper since we depend on zkclient instead we should remove
> this from the LICENSE file.
>

+1 to remove this entry from the LICENSE file now that it's no longer a
transitive dependency.

Mike


Re: [VOTE] Release Apache Flume version 1.7.0 RC1

2016-10-12 Thread Mike Percy
-1 for RC1

I've found a few problems with RC1:

1. As mentioned by Donat, the signed source artifact was corrupted somehow.
I am guessing the maven assembly plugin had some problem at build / deploy
time.
2. I locally built a source tarball from the RC1 tag and compared the file
contents with the git tag. I noticed the following discrepancies in the
resulting source artifact:
  * README.md is missing from the source tarball
  * doap_Flume.rdf is missing from the source tarball
  * The source tarball contains .iml files from my local dev environment
3. I checked that the binary artifact starts up with a simple configuration
file. I checked the LICENSE file and found the discrepancies that Attila
found as well.

Personally, I don't like using the maven-assembly-plugin to generate the
source artifact, since it requires maintenance. IMHO it would be better to
replace it with a small shell script that invokes git archive against the
release tag, but that is not a release blocker.
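
As a rough illustration of the git archive idea, here is a minimal sketch, written in Python rather than shell. The function names, tag name, prefix, and output file name are all hypothetical; only the git archive options (--format, --prefix, --output) are real. Because git archive exports the tree recorded at the given tag, untracked local files such as IDE project files would not end up in the tarball.

```python
import subprocess


def build_archive_command(tag, prefix, output):
    """Build the `git archive` argument list for a release tag."""
    return [
        "git", "archive",
        "--format=tar.gz",
        f"--prefix={prefix}/",   # top-level directory inside the tarball
        f"--output={output}",
        tag,
    ]


def create_source_tarball(repo_dir, tag, prefix, output):
    """Run `git archive` inside repo_dir.

    Raises subprocess.CalledProcessError if git exits with an error,
    for example when the tag does not exist.
    """
    subprocess.run(build_archive_command(tag, prefix, output),
                   cwd=repo_dir, check=True)
```

A call such as create_source_tarball(".", "release-1.7.0-rc1", "apache-flume-1.7.0-src", "apache-flume-1.7.0-src.tar.gz") would write the tarball relative to the repository directory; the tag name here is made up for the example.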

Mike

On Wed, Oct 12, 2016 at 5:07 PM, Attila Simon  wrote:

> Hi All,
>
> I checked the shipped libraries against the LICENSE file in the binary
> tarball. It turned out that there are some discrepancies.
>
> Additional libraries shipped but not mentioned in the licence file:
> hamcrest-core (BSD License), junit (BSD License), xz
> (http://git.tukaani.org/?p=xz.git;a=blob;f=COPYING),
> kafka-clients(Apache License v2),
> lz4(https://github.com/lz4/lz4/blob/master/lib/LICENSE)
>
> Library mentioned in the licence file but not shipped directly:
> zookeeper(Apache License v2)
>
> Libraries which require little correction in the name in the licence file:
> jackson-core-asl-1..jar
> jackson-mapper-asl-1..jar
>
> Suggested actions:
>  1) hamcrest-core, junit these are testing libs, I think shouldn't be
> shipped (only required for tests, I guess they were added by the new
> .*shared.* module)
>  2) kafka-clients, lz4 should be added to our LICENSE file
>  3) jackson.* fix the typo in the LICENSE file
>  4) zookeeper since we depend on zkclient instead we should remove
> this from the LICENSE file.
>
> Cheers,
> Attila
>
>
>
> On Tue, Oct 11, 2016 at 7:23 PM, Balazs Donat Bessenyei
>  wrote:
> >
> > Hi All,
> >
> > This is the tenth release for Apache Flume as a top-level project,
> > version 1.7.0. We are voting on release candidate RC1.
> >
> > It fixes the following issues:
> >   https://raw.githubusercontent.com/apache/flume/flume-1.7/CHANGELOG
> >
> > *** Please cast your vote within the next 72 hours ***
> >
> > The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
> > for the source and binary artifacts can be found here:
> >   http://people.apache.org/~bessbd/apache-flume-1.7.0-rc1/
> >
> > Maven staging repo:
> >   https://repository.apache.org/content/repositories/orgapacheflume-1018/
> >
> > The tag to be voted on:
> >   https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=1a62453
> >
> > Flume's KEYS file containing PGP keys we use to sign the release:
> >   https://www.apache.org/dist/flume/KEYS
> >
> >
> > Thank you,
> >
> > Donat
>


Re: Review Request 52627: FLUME-2971. Document secure Kafka Sink/Source/Channel setup

2016-10-10 Thread Mike Percy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52627/#review152037
---


Ship it!

- Mike Percy


On Oct. 10, 2016, 11:04 a.m., Attila Simon wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52627/
> ---
> 
> (Updated Oct. 10, 2016, 11:04 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2971
> https://issues.apache.org/jira/browse/FLUME-2971
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> The patch aims to extend the existing documentation of secure Kafka channel 
> with describing SSL+Plaintext setup as well as providing the whole package 
> (SSL+Kerberos+Plain) for KafkaSource and KafkaSink.
> 
> 
> Diffs
> -
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst ab71d38 
> 
> Diff: https://reviews.apache.org/r/52627/diff/
> 
> 
> Testing
> ---
> 
> "mvn site" generated the user guide without an error message in the html. 
> Embedded links are checked not to be broken.
> 
> Known to require attention: Content of the jaas file has to be checked 
> focusing on the requirement of the Client section in every setup.
> 
> 
> Thanks,
> 
> Attila Simon
> 
>



[jira] [Resolved] (FLUME-2999) Kafka channel and sink should enable statically assigned partition per event via header

2016-10-10 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy resolved FLUME-2999.
---
   Resolution: Fixed
Fix Version/s: v1.7.0

Pushed to trunk. Thanks for the patch Tristan!

> Kafka channel and sink should enable statically assigned partition per event 
> via header
> ---
>
> Key: FLUME-2999
> URL: https://issues.apache.org/jira/browse/FLUME-2999
> Project: Flume
> Issue Type: Improvement
> Components: Channel, Sinks+Sources
> Affects Versions: v1.7.0
> Reporter: Tristan Stevens
> Assignee: Tristan Stevens
> Priority: Minor
> Fix For: v1.7.0
>
>
> This feature is useful for anyone who needs greater control of which 
> partitions are being written to - normally in a situation where multiple 
> Flume agents are being deployed in order to horizontally scale, or 
> alternatively if there is a scenario where there is a skew in data that might 
> lead to one or more partitions hotspotting.
> We also have the ability to specify custom partitions on to the Kafka 
> Producer itself using the kafka.* configuration properties.
> The Kafka Producer provides the ability to set the partition ID using the 
> following constructor 
> (https://kafka.apache.org/090/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html#ProducerRecord(java.lang.String,%20java.lang.Integer,%20K,%20V%29
>  ), this is just a matter of providing the option to use this constructor.
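
The per-event partition choice described in this issue can be sketched as below. This is only an illustration of the decision logic in Python, not the committed Java code: the function name is hypothetical, the two settings mirror the staticPartition and partitionHeader properties named in the review request for this change, and the precedence (a header value overriding the static partition) is an assumption, since the description does not spell it out.

```python
def choose_partition(headers, static_partition=None, partition_header=None):
    """Return the partition ID for one event, or None if unspecified.

    Assumption: a value found in the configured header overrides the
    static partition. Returning None stands for "no explicit partition",
    leaving the choice to the Kafka producer's partitioner.
    """
    partition = static_partition
    if partition_header is not None:
        value = headers.get(partition_header)
        if value is not None:
            # Header values are strings, so convert to an integer ID.
            partition = int(value)
    return partition
```

The returned value would then be passed as the partition argument of the ProducerRecord constructor referenced above.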



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2999) Kafka channel and sink should enable statically assigned partition per event via header

2016-10-10 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563027#comment-15563027
 ] 

Mike Percy commented on FLUME-2999:
---

+1. I am about to commit the latest rev from FLUME-2999.

> Kafka channel and sink should enable statically assigned partition per event 
> via header
> ---
>
> Key: FLUME-2999
> URL: https://issues.apache.org/jira/browse/FLUME-2999
> Project: Flume
> Issue Type: Improvement
> Components: Channel, Sinks+Sources
> Affects Versions: v1.7.0
> Reporter: Tristan Stevens
> Assignee: Tristan Stevens
> Priority: Minor
>
> This feature is useful for anyone who needs greater control of which 
> partitions are being written to - normally in a situation where multiple 
> Flume agents are being deployed in order to horizontally scale, or 
> alternatively if there is a scenario where there is a skew in data that might 
> lead to one or more partitions hotspotting.
> We also have the ability to specify custom partitions on to the Kafka 
> Producer itself using the kafka.* configuration properties.
> The Kafka Producer provides the ability to set the partition ID using the 
> following constructor 
> (https://kafka.apache.org/090/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html#ProducerRecord(java.lang.String,%20java.lang.Integer,%20K,%20V%29
>  ), this is just a matter of providing the option to use this constructor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 52598: FLUME-2999 - Kafka channel and sink should enable statically assigned partition per event via header

2016-10-10 Thread Mike Percy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52598/#review152027
---


Ship it!

- Mike Percy


On Oct. 10, 2016, 1:53 a.m., Tristan Stevens wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52598/
> ---
> 
> (Updated Oct. 10, 2016, 1:53 a.m.)
> 
> 
> Review request for Flume and Grant Henke.
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> This feature is useful for anyone who needs greater control of which 
> partitions are being written to - normally in a situation where multiple 
> Flume agents are being deployed in order to horizontally scale, or 
> alternatively if there is a scenario where there is a skew in data that might 
> lead to one or more partitions hotspotting.
> We also have the ability to specify custom partitions on the Kafka 
> Producer itself using the kafka.* configuration properties.
> 
> The Kafka Producer provides the ability to set the partition ID using the 
> following constructor 
> (https://kafka.apache.org/090/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html#ProducerRecord%28java.lang.String,%20java.lang.Integer,%20K,%20V%29
>  ), so this is just a matter of providing the option to use this constructor.
> 
> This is specified in one of two ways: either via the staticPartition 
> configuration property, which means that every message goes to the specified 
> partition, or via the partitionHeader configuration property, which directs 
> the implementation to retrieve the partitionId from one of the event headers.
> 
> 
> Diffs
> -
> 
>   flume-ng-channels/flume-kafka-channel/pom.xml c1cc844
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 66b553a
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannelConfiguration.java 3ab807b
>   flume-ng-channels/flume-kafka-channel/src/test/java/org/apache/flume/channel/kafka/TestKafkaChannel.java 57c0b28
>   flume-ng-doc/sphinx/FlumeUserGuide.rst ab71d38
>   flume-ng-sinks/flume-ng-kafka-sink/pom.xml 195c921
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 89bdd84
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSinkConstants.java 1bf380c
>   flume-ng-sinks/flume-ng-kafka-sink/src/test/java/org/apache/flume/sink/kafka/TestKafkaSink.java 76eca37
>   flume-ng-sources/flume-kafka-source/pom.xml c89ea1a
>   flume-shared/flume-shared-kafka-test/pom.xml PRE-CREATION
>   flume-shared/flume-shared-kafka-test/src/main/java/org/apache/flume/shared/kafka/test/KafkaPartitionTestUtil.java PRE-CREATION
>   flume-shared/flume-shared-kafka-test/src/main/java/org/apache/flume/shared/kafka/test/PartitionOption.java PRE-CREATION
>   flume-shared/flume-shared-kafka-test/src/main/java/org/apache/flume/shared/kafka/test/PartitionTestScenario.java PRE-CREATION
>   flume-shared/pom.xml PRE-CREATION
>   pom.xml 2332a29
> 
> Diff: https://reviews.apache.org/r/52598/diff/
> 
> 
> Testing
> ---
> 
> Unit testing done for both Kafka Channel and Kafka Sink.
> 
> 
> Thanks,
> 
> Tristan Stevens
> 
>
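
For readers following the review, the partition selection described in the 
request above (a fixed partition from staticPartition, or a per-event 
partition read from the header named by partitionHeader) could look roughly 
like the sketch below. This is not the code from the patch. The class and 
method names are hypothetical, and the precedence (header first, then the 
static value) is an assumption, since the description does not say which 
setting wins when both are configured. An empty result would mean "pass a 
null partition" to the ProducerRecord constructor linked in the description, 
leaving the choice to the producer's partitioner.

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical helper, not part of the FLUME-2999 patch.
final class PartitionChooser {
    private PartitionChooser() {}

    /**
     * Picks the partition for one event.
     *
     * @param staticPartition value of the staticPartition property, or null if unset
     * @param partitionHeader value of the partitionHeader property, or null if unset
     * @param headers         headers of the event being written
     * @return the partition to use, or empty if none was configured or found
     */
    static OptionalInt choosePartition(Integer staticPartition,
                                       String partitionHeader,
                                       Map<String, String> headers) {
        // Assumption: a per-event header takes precedence over the static value.
        if (partitionHeader != null) {
            String raw = headers.get(partitionHeader);
            if (raw != null) {
                // Throws NumberFormatException if the header is not an integer.
                return OptionalInt.of(Integer.parseInt(raw.trim()));
            }
        }
        if (staticPartition != null) {
            return OptionalInt.of(staticPartition);
        }
        return OptionalInt.empty();
    }
}
```

A caller would then build the record with the resolved partition, or with 
null when the result is empty.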



[jira] [Updated] (FLUME-2911) Add includePattern option in SpoolDirectorySource configuration

2016-10-10 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated FLUME-2911:
--
Affects Version/s: (was: notrack)

> Add includePattern option in SpoolDirectorySource configuration
> ---
>
> Key: FLUME-2911
> URL: https://issues.apache.org/jira/browse/FLUME-2911
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: v1.6.0, v1.7.0
> Reporter: Andrea Rota
> Assignee: Andrea Rota
> Labels: features
> Fix For: v1.7.0
>
> Attachments: FLUME-2911.patch
>
>
> The current implementation of SpoolDirectorySource does not allow users to 
> specify a regex pattern to select which files should be monitored. Instead, 
> it only allows users to specify which files should *not* be monitored, via 
> the ignorePattern parameter.
> I implemented the feature, allowing users to specify the include pattern as 
> {{a1.sources.src-1.includePattern=^foo.*$}} (includes all files whose names 
> start with "foo").
> By default, the includePattern regex is set to {{^.*$}} (all files). Include 
> and exclude patterns can be used at the same time and the results are combined.
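
One way to read "the results are combined" is that a file is picked up only 
if its name matches includePattern and does not match ignorePattern. The 
sketch below illustrates that reading with java.util.regex. It is not the 
code from FLUME-2911.patch. The class name is hypothetical, and matching 
against the whole file name (rather than a substring) is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of combining include and ignore patterns.
final class SpoolFileFilter {
    private final Pattern includePattern;
    private final Pattern ignorePattern;

    SpoolFileFilter(String includeRegex, String ignoreRegex) {
        this.includePattern = Pattern.compile(includeRegex);
        this.ignorePattern = Pattern.compile(ignoreRegex);
    }

    /** Returns true if the file name is included and not ignored. */
    boolean accepts(String fileName) {
        // Assumption: both patterns must match the whole file name.
        return includePattern.matcher(fileName).matches()
                && !ignorePattern.matcher(fileName).matches();
    }
}
```

With includePattern set to ^foo.*$ and an ignore pattern of ^.*\.tmp$, this 
filter would accept foo.log but reject both bar.log and foo.tmp.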





Re: Review Request 52627: FLUME-2971. Document secure Kafka Sink/Source/Channel setup

2016-10-10 Thread Mike Percy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52627/#review151999
---



Nice work! FYI, I copied the patched documentation (v2 of the patch) into a 
GitHub gist for easy reading, since GitHub knows how to render 
reStructuredText:
https://gist.github.com/mpercy/40017fd82cc21af41ddb7cba2b2f4600
Consider posting a gist link to the rendered document for ease of review the 
next time you make a large documentation contribution like this.

- Mike Percy


On Oct. 7, 2016, 6:27 a.m., Attila Simon wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52627/
> ---
> 
> (Updated Oct. 7, 2016, 6:27 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2971
> https://issues.apache.org/jira/browse/FLUME-2971
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> The patch extends the existing documentation of the secure Kafka channel by 
> describing the SSL+Plaintext setup, and provides the whole package 
> (SSL+Kerberos+Plain) for KafkaSource and KafkaSink.
> 
> 
> Diffs
> -
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst ab71d38 
> 
> Diff: https://reviews.apache.org/r/52627/diff/
> 
> 
> Testing
> ---
> 
> "mvn site" generated the user guide without an error message in the HTML. 
> Embedded links were checked and are not broken.
> 
> Known to require attention: the content of the JAAS file has to be checked, 
> focusing on whether the Client section is required in every setup.
> 
> 
> Thanks,
> 
> Attila Simon
> 
>



Re: Review Request 52627: FLUME-2971. Document secure Kafka Sink/Source/Channel setup

2016-10-10 Thread Mike Percy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52627/#review151993
---




flume-ng-doc/sphinx/FlumeUserGuide.rst (line 1334)
<https://reviews.apache.org/r/52627/#comment220666>

I don't think it's necessary to link to the Cloudera article (actually it's 
not a blog, it's the CDH release notes which are not really relevant to the 
upstream docs). The KAFKA JIRA should be fine.



flume-ng-doc/sphinx/FlumeUserGuide.rst (line 3101)
<https://reviews.apache.org/r/52627/#comment220669>

I agree with Tristan that this information should not be repeated verbatim. 
I think we could simply add a link to this section (the channel section) from 
the source and sink component sections, but keep the component-specific 
examples where appropriate and helpful.



flume-ng-doc/sphinx/FlumeUserGuide.rst (line 3158)
<https://reviews.apache.org/r/52627/#comment220670>

I don't know what this means. Can you clarify what CN and SAN are? Are they 
part of the JAAS spec or something? Can you hyperlink those terms to 
documentation or provide a reference to where we can find more relevant 
information?



flume-ng-doc/sphinx/FlumeUserGuide.rst (line 3203)
<https://reviews.apache.org/r/52627/#comment220665>

nit: please add spaces around the equals signs for consistency, here and 
elsewhere


- Mike Percy





[jira] [Updated] (FLUME-2911) Add includePattern option in SpoolDirectorySource configuration

2016-10-10 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated FLUME-2911:
--
Assignee: Andrea Rota

> Add includePattern option in SpoolDirectorySource configuration
> ---
>
> Key: FLUME-2911
> URL: https://issues.apache.org/jira/browse/FLUME-2911
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: notrack, v1.6.0, v1.7.0
> Reporter: Andrea Rota
> Assignee: Andrea Rota
> Labels: features
> Attachments: FLUME-2911.patch
>
>
> The current implementation of SpoolDirectorySource does not allow users to 
> specify a regex pattern to select which files should be monitored. Instead, 
> it only allows users to specify which files should *not* be monitored, via 
> the ignorePattern parameter.
> I implemented the feature, allowing users to specify the include pattern as 
> {{a1.sources.src-1.includePattern=^foo.*$}} (includes all files whose names 
> start with "foo").
> By default, the includePattern regex is set to {{^.*$}} (all files). Include 
> and exclude patterns can be used at the same time and the results are combined.




