[jira] [Commented] (FLUME-2921) Support Elasticsearch 2.0+

2016-07-12 Thread Lior Zeno (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374402#comment-15374402
 ] 

Lior Zeno commented on FLUME-2921:
--

I think we should avoid a custom classloader, and simply shade/remove 
dependencies as necessary. If a user wants to run both ES and HDFS sinks in the 
same agent, they would have to shade one of them. This is perfectly acceptable.

I think that the best thing to do right now is to block this issue until we 
clean up our public API. In the meantime, users may use the REST client with 
1.x and later clusters.
Since this is a popular sink, there are other projects on GitHub that 
implement Elasticsearch 2.x support for Flume (obviously, by breaking 
backwards compatibility), e.g. 
https://github.com/arberzal/flume-ng-elasticsearch2-sink. 

> Support Elasticsearch 2.0+
> --
>
> Key: FLUME-2921
> URL: https://issues.apache.org/jira/browse/FLUME-2921
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Reporter: Lior Zeno
>Assignee: Lior Zeno
> Fix For: v1.7.0
>
> Attachments: FLUME-2921-0.patch, FLUME-2921-1.patch
>
>
> Elasticsearch sink supports an ancient version of ES. We should make the sink 
> work with newer versions of Elasticsearch.
> I attached a patch for that. Please note that this involves upgrading netty 
> and guava.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2953) Make TaildirSource work with recursive directory

2016-07-12 Thread tinawenqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374147#comment-15374147
 ] 

tinawenqiao commented on FLUME-2953:


I'm confused. I thought branch 1.7 was the next release version, so I based my 
patch on branch 1.7. What should I do? Can someone guide me?

> Make TaildirSource work with recursive directory
> 
>
> Key: FLUME-2953
> URL: https://issues.apache.org/jira/browse/FLUME-2953
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: tinawenqiao
>  Labels: Recuresive, TaildirSource, Wildcards
> Fix For: v1.7.0
>
> Attachments: FLUME-2953.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In TaildirSource filegroupName, regular expression can be used for filename 
> only. Sample usage: a1.sources.r1.filegroups.f2 = /var/log/test2/.*log.*
> If there are many files to be tracked in the same directory, the 
> configuration is oft-repeated. So it's necessary that wildcards are supported 
> in the directory path. Then the user can configure the filegroupName like 
> this:
>  a1.sources.r1.filegroups.f2 = /var/log/*/.*log.*
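
For illustration only, here is one possible reading of the requested behavior 
as a small Java sketch. It is not taken from the attached patch. It assumes 
that a "*" path segment matches exactly one directory level and that only the 
last segment is a regular expression; the class and method names are 
hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: match an absolute path against a filegroup pattern
// whose directory segments may be the wildcard "*" and whose last segment
// is a regular expression for the file name.
class WildcardFilegroupSketch {

  static boolean matches(String filegroupPattern, String absolutePath) {
    String[] pat = filegroupPattern.split("/");
    String[] path = absolutePath.split("/");
    // Assumption: "*" stands for exactly one directory level, so both
    // sides must have the same number of segments.
    if (pat.length != path.length) {
      return false;
    }
    int last = pat.length - 1;
    for (int i = 0; i < last; i++) {
      boolean wildcard = pat[i].equals("*");
      if (!wildcard && !pat[i].equals(path[i])) {
        return false;
      }
    }
    // The file name segment is treated as a full regular expression.
    return Pattern.matches(pat[last], path[last]);
  }

  public static void main(String[] args) {
    String pattern = "/var/log/*/.*log.*";
    System.out.println(matches(pattern, "/var/log/app1/server.log"));
    System.out.println(matches(pattern, "/var/log/server.log"));
  }
}
```

A file name regex that itself contains "/" would not work with this naive 
split; a real implementation would need to handle that case.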





[jira] [Commented] (FLUME-2953) Make TaildirSource work with recursive directory

2016-07-12 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374072#comment-15374072
 ] 

Mike Percy commented on FLUME-2953:
---

I don't know why branch 1.7 exists. That branch should not exist yet, since we 
have not branched for 1.7 yet.

> Make TaildirSource work with recursive directory
> 
>
> Key: FLUME-2953
> URL: https://issues.apache.org/jira/browse/FLUME-2953
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: tinawenqiao
>  Labels: Recuresive, TaildirSource, Wildcards
> Fix For: v1.7.0
>
> Attachments: FLUME-2953.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In TaildirSource filegroupName, regular expression can be used for filename 
> only. Sample usage: a1.sources.r1.filegroups.f2 = /var/log/test2/.*log.*
> If there are many files to be tracked in the same directory, the 
> configuration is oft-repeated. So it's necessary that wildcards are supported 
> in the directory path. Then the user can configure the filegroupName like 
> this:
>  a1.sources.r1.filegroups.f2 = /var/log/*/.*log.*





[jira] [Commented] (FLUME-2921) Support Elasticsearch 2.0+

2016-07-12 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374038#comment-15374038
 ] 

Mike Percy commented on FLUME-2921:
---

Moving some discussion back to this JIRA from the code review. Based on your 
investigation in the code review, Flume exposes Guava as part of its public 
API, particularly in Context.java, in public ImmutableMap 
getSubProperties(String prefix). This is really unfortunate as it means that it 
is not possible to shade Guava in Flume.
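
To make the coupling concrete, here is a minimal sketch of what a signature 
without a foreign namespace could look like. It is not Flume's Context class: 
ContextSketch and its fields are hypothetical, and the prefix handling is an 
assumption. Note that changing the return type of an existing public method 
from ImmutableMap to java.util.Map changes the method descriptor, so it is 
itself a binary-incompatible change for already compiled callers.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration context whose public API
// names only java.util types, so a third-party library can be shaded.
class ContextSketch {
  private final Map<String, String> parameters = new LinkedHashMap<>();

  public void put(String key, String value) {
    parameters.put(key, value);
  }

  // Returns a read-only view of all entries whose key starts with the
  // given prefix, with the prefix stripped from the keys.
  public Map<String, String> getSubProperties(String prefix) {
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : parameters.entrySet()) {
      String key = entry.getKey();
      if (key.startsWith(prefix)) {
        result.put(key.substring(prefix.length()), entry.getValue());
      }
    }
    // Immutability is kept, but no Guava type appears in the signature.
    return Collections.unmodifiableMap(result);
  }

  public static void main(String[] args) {
    ContextSketch ctx = new ContextSketch();
    ctx.put("sink.es.hostNames", "localhost:9300");
    ctx.put("source.type", "TAILDIR");
    System.out.println(ctx.getSubProperties("sink."));
  }
}
```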

That means, because Flume has HDFS, Hive, and HBase sinks with transitive deps 
on Guava 11, Flume core is stuck on Guava 11. We cannot upgrade Guava to 
version 18, which is apparently required by the ES 2.0 client.

Short of cutting a Flume 2.0 to break API compatibility (we follow the 
www.semver.org versioning standard), which we should probably do at some point 
(though I do not think we should do it instead of 1.7.0), I think all of our 
options are bad.

One bad, but potentially feasible option for people who need ES 2.x is the 
following:

* Create an out-of-tree ES 2.x sink on GitHub. It would depend on Guava 18 and 
pull in that dep
* When deploying the ES2 sink with Flume, ensure that the Guava 18 JAR appears 
on the classpath ahead of the Guava 11 JAR
* Note that the above will likely break other sinks, like the HDFS sink, so 
they could not be used in the same agent

This is quite hacky, and I'm not sure it would work without testing it, but I 
think it could be made to work since ImmutableMap appears in Guava 18 and that 
is the only absolutely required Guava dependency in Flume core.

I am open to other options that do not break backcompat in 1.x

On a side note, for Flume 2.x, if someone wants to tackle the custom class 
loader work, that would be a very interesting way to help avoid these 
problems. I hope we could maintain backcompat for most Flume 1.x plugins in 
such a world; we would only have to break the plugins that call APIs "tainted" 
by Guava.

> Support Elasticsearch 2.0+
> --
>
> Key: FLUME-2921
> URL: https://issues.apache.org/jira/browse/FLUME-2921
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Reporter: Lior Zeno
>Assignee: Lior Zeno
> Fix For: v1.7.0
>
> Attachments: FLUME-2921-0.patch, FLUME-2921-1.patch
>
>
> Elasticsearch sink supports an ancient version of ES. We should make the sink 
> work with newer versions of Elasticsearch.
> I attached a patch for that. Please note that this involves upgrading netty 
> and guava.





Re: Review Request 49458: FLUME-2921 Support Elasticsearch 2.0+

2016-07-12 Thread Mike Percy


> On July 11, 2016, 2:19 p.m., Mike Percy wrote:
> > flume-ng-configuration/src/main/java/org/apache/flume/Context.java, line 85
> > 
> >
> > This is an API and ABI breaking change.
> > 
> > We would need to release Flume 2.0 to do this.
> 
> Lior Zeno wrote:
> Guava is shaded, it's problematic to expose a shaded jar in the API.
> 
> Lior Zeno wrote:
> I must add that I think it's a bad practice to expose a foreign namespace 
> in our API. Either org.apache.flume.X or native java namespaces are 
> acceptable. Exposing a foreign namespace results in exactly this problem. We 
> are now going to break the API because of this.

It's really unfortunate that we have exposed Guava in the API. I had lunch with 
Hari today and he was surprised as well. I think this was an oversight on our 
part at the time and something we didn't really intend. We also didn't predict 
that Guava would be so backwards incompatible. Guava is quite dangerous, and in 
hindsight that is very clear now.

Sadly, now I think we are stuck with it. We cannot break API compatibility 
within the 1.x line. We should follow the standard at http://semver.org in this 
regard.

Overall I am probably nearly as frustrated with this problem as you are. I 
don't think there is a clean way out of this.

Some hacky options come to mind in terms of how to move forward. None of them 
are very good. I'll post them on the JIRA and we can discuss our options there.


- Mike


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49458/#review141781
---


On July 9, 2016, 3:13 a.m., Lior Zeno wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49458/
> ---
> 
> (Updated July 9, 2016, 3:13 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2921
> https://issues.apache.org/jira/browse/FLUME-2921
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> This patch adds the support for Elasticsearch version 2.0+. The version I 
> used is 2.3.3, which is the latest stable release.
> This patch does not fix any known issues with this sink, its only purpose is 
> to support current versions of elasticsearch.
> 
> Elasticsearch 2.3.3 depends on guava 18.0, which collided with our version. I 
> had to create a new module, flume-ng-elasticsearch-shaded, and shade guava. 
> This worked this time, but due to guava's popularity I think we should remove 
> this dependency in the future. This should be easier, now that Flume uses 
> Java 1.7.
> 
> 
> Diffs
> -
> 
>   flume-ng-configuration/src/main/java/org/apache/flume/Context.java f00b571 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-dist/src/main/assembly/bin.xml a61180d 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst f9ca1b2 
>   flume-ng-embedded-agent/src/test/java/org/apache/flume/agent/embedded/TestEmbeddedAgentEmbeddedSource.java c122a12 
>   flume-ng-node/src/main/java/org/apache/flume/node/MaterializedConfiguration.java a80bfdf 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/pom.xml c372c0b 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java 83c3ffd 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchLogStashEventSerializer.java 3638368 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchTransportClient.java 2cf365e 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchSinkTest.java 9fbd747 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchDynamicSerializer.java d4e4654 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchIndexRequestBuilderFactory.java b62254e 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchLogStashEventSerializer.java 65b4dab 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSink.java 69acc06 
>   pom.xml b50693e 
> 
> Diff: https://reviews.apache.org/r/49458/diff/
> 
> 
> Testing
> ---
> 
> I made sure that all unit tests (due to guava upgrade) pass successfully. The 
> known flaky tests may not pass, though.
> In addition, I tested the sink against a local elasticsearch instance.
> 
> 
> Thanks,
> 
> Lior Zeno
> 
>



Re: raw data in log messages is a security risk

2016-07-12 Thread Mike Percy
Great, thanks for filing that JIRA Attila. Let's continue the discussion
there.

Mike

On Tue, Jul 12, 2016 at 8:50 AM, Attila Simon  wrote:

> Hi Mike,
>
> I created a jira (https://issues.apache.org/jira/browse/FLUME-2954)
> and would like to look around first in the codebase where such content
> log was introduced. Based on the actual use cases we can discuss
> further what would be the best approach. Thanks for the log4j concerns
> that actually moved my standpoint a bit.
>
> Cheers,
> Attila
>
>
> Attila Simon
> Software Engineer
> Email:   s...@cloudera.com
>
>
>
>
> On Sat, Jul 9, 2016 at 3:06 AM, Mike Percy  wrote:
> > Hi Attila,
> > Thanks for bringing this up. I agree that we should prevent logging data
> > unless it is explicitly enabled.
> >
> > One concern I have about the log4j approach is that many people have
> > customized their log4j.properties file. As long as your proposal would
> > keep logging of data disabled for all of the likely customizations people
> > might have in place then it sounds good. However maybe you can be a little
> > more specific about how it would look in the log4j.properties file and how
> > it would look at the code level when writing to that named logger. I'm not
> > totally sure I understand your exact proposal.
> >
> > Mike
> >
> > On Tue, Jul 5, 2016 at 9:44 AM, Attila Simon  wrote:
> >
> >> Hi,
> >>
> >> Flume has built-in functionality to log the data flowing through it,
> >> mainly for debugging purposes. This functionality appears in several
> >> places in the codebase. I think it raises security
> >> concerns in production environments where sensitive information might
> >> be ingested, so it is crucial that enabling it is
> >> as explicit as possible (avoiding setup via implicit side effects).
> >> E.g.: setting the level of the root logger to debug/trace causes every
> >> other logger to start logging at debug/trace, including the ones
> >> logging raw data.
> >>
> >> Options to solve this issue:
> >> 1) command line option to enable data logging
> >> 2) configuration property to enable data logging globally
> >> 3) implementing a single concept which is solely responsible for
> >> logging ie a single LoggerSink (which already exists) or Interceptor
> >> 4) introduction of a new named logger instance which is configured OFF
> >> in log4j config
> >> 5) any other idea is welcomed
> >>
> >> Considering the pros and cons of the usage and implementation I would
> >> vote for 4) but I require your opinion. I'm going to open a jira to
> >> tackle this work (please let me know if there are some important
> >> fields I have to set considering 1.7 release).
> >>
> >> Cheers,
> >> Attila
> >>
>


[jira] [Commented] (FLUME-2954) make raw data appearing in log messages explicit

2016-07-12 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373989#comment-15373989
 ] 

Mike Percy commented on FLUME-2954:
---

I agree that simply enabling debug or trace logging in Flume should never log 
actual data, unless that has been explicitly enabled. The current behavior 
makes debugging in secure environments practically impossible.

Thanks for looking at this issue.

> make raw data appearing in log messages explicit
> 
>
> Key: FLUME-2954
> URL: https://issues.apache.org/jira/browse/FLUME-2954
> Project: Flume
>  Issue Type: Improvement
>  Components: Channel, Configuration, Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Attila Simon
>Assignee: Attila Simon
>Priority: Critical
>
> Flume has built-in functionality to log the data flowing through it,
> mainly for debugging purposes. This functionality appears in several
> places in the codebase. I think it raises security
> concerns in production environments where sensitive information might
> be ingested, so it is crucial that enabling it is
> as explicit as possible (avoiding setup via implicit side effects).
> E.g.: setting the level of the root logger to debug/trace causes every
> other logger to start logging at debug/trace, including the ones
> logging raw data.
> In this jira I would like to provide a patch capturing how I imagined solving 
> this issue. It can be refined iteratively or used as a basis for a broader 
> discussion.





Re: Build failed in Jenkins: Flume-trunk-hbase-1 #177

2016-07-12 Thread Mike Percy
On Tue, Jul 12, 2016 at 3:19 AM, Lior Zeno  wrote:

> The integration tests fail because there were earlier errors in the build,
> so Maven did not generate the tarball.
>

Yes, this is exactly right.

Mike


Re: Build failed in Jenkins: Flume-trunk-hbase-1 #177

2016-07-12 Thread Mike Percy
Please see inline...

On Tue, Jul 12, 2016 at 2:01 AM, Attila Simon  wrote:

> "TestFileChannel.testInOut:117 ?  Failed to locate tar-ball
> distribution. Pleas..."
> These seem transient and build related. OOM killer, disk swipe, some
> restart, etc. Would you mind retriggering the job? I don't have
> permissions for that.
>

I retriggered the job and it passed! Good or bad, you decide. :)

> @Mike StagedInstall was last edited by you. Do you happen to remember
> whether there is any documentation explaining how it is supposed
> to work in detail, and what its weak points are, if any?


It was Arvind's baby. :) And he basically did it as a quick hack to get an
integration test framework working. I don't remember doing major surgery to
it. LMK if you have specific questions. One drawback, IIRC, is that we
didn't put in the necessary plumbing to make it start up quickly, so there
are a bunch of sleeps and the tests run slowly.

> @Mike: decomposing what you wrote: ~5.6G is the max memory and flume
> would like to allocate its first ~1M block of direct memory. I guess
> jenkins was configured with a non sun/oracle jvm but that shouldn't be
> an issue. If the OOM killer did this then it might be related, but it is
> unlikely that 1M triggered it (unless the allowed max direct memory was set
> to something smaller than 1M). Otherwise it seems normal as far as the
> file channel is concerned.
>

I agree that 1MB should not trigger the OOM killer.

This failure is still a mystery to me.

Mike


Re: Review Request 49453: Patch for FLUME-2725

2016-07-12 Thread Mike Percy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49453/#review141976
---




flume-ng-core/src/main/java/org/apache/flume/tools/TimestampRoundDownUtil.java 
(line 26)


Please add annotations to this class: @Private and @Evolving



flume-ng-core/src/main/java/org/apache/flume/tools/TimestampRoundDownUtil.java 
(line 143)


I think adding parentheses around the condition in the ternary operator 
here would make this line more readable. Would you mind doing that? e.g.:

Calendar cal = (timeZone == null) ? Calendar.getInstance() : Calendar.getInstance(timeZone);
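
For context, a rough sketch of why the time zone matters when rounding down, 
using the same ternary. This is not the code under review; RoundDownSketch 
and its method names are hypothetical and the validation is an assumption. 
With a zone such as Asia/Kolkata (UTC+5:30), rounding down to the hour gives 
a different instant than rounding in UTC.

```java
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical sketch of rounding a timestamp down to a multiple of
// N hours, evaluated in a caller-supplied time zone.
class RoundDownSketch {

  static long roundDownHours(long timestamp, int roundDownHours, TimeZone timeZone) {
    if (roundDownHours <= 0 || roundDownHours > 24) {
      throw new IllegalArgumentException("roundDownHours must be in 1..24");
    }
    // Fall back to the JVM default zone when none is configured.
    Calendar cal = (timeZone == null) ? Calendar.getInstance() : Calendar.getInstance(timeZone);
    cal.setTimeInMillis(timestamp);
    int hour = cal.get(Calendar.HOUR_OF_DAY);
    cal.set(Calendar.HOUR_OF_DAY, (hour / roundDownHours) * roundDownHours);
    cal.set(Calendar.MINUTE, 0);
    cal.set(Calendar.SECOND, 0);
    cal.set(Calendar.MILLISECOND, 0);
    return cal.getTimeInMillis();
  }

  // Helper for building test instants without hard-coded epoch values.
  static long utcMillis(int year, int month, int day, int hour, int minute) {
    Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
    cal.clear();
    cal.set(year, month, day, hour, minute, 0);
    return cal.getTimeInMillis();
  }

  public static void main(String[] args) {
    long ts = utcMillis(2016, Calendar.JULY, 12, 10, 45);
    long inUtc = roundDownHours(ts, 1, TimeZone.getTimeZone("UTC"));
    long inKolkata = roundDownHours(ts, 1, TimeZone.getTimeZone("Asia/Kolkata"));
    // The two results differ by 30 minutes because of the zone offset.
    System.out.println((inKolkata - inUtc) / 60000L + " minutes apart");
  }
}
```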



flume-ng-core/src/test/java/org/apache/flume/formatter/output/TestBucketPath.java
 (line 57)


nit: You can use Java 7 syntax for the generic and make it new HashMap<>();



flume-ng-core/src/test/java/org/apache/flume/formatter/output/TestBucketPath.java
 (line 93)


Please add a short comment describing the purpose of this test.



flume-ng-core/src/test/java/org/apache/flume/formatter/output/TestBucketPath.java
 (line 127)


Needs test comment



flume-ng-core/src/test/java/org/apache/flume/formatter/output/TestBucketPath.java
 (line 133)


Can you reduce the copy / paste in this test?



flume-ng-core/src/test/java/org/apache/flume/formatter/output/TestBucketPath.java
 (line 155)


Test comment



flume-ng-core/src/test/java/org/apache/flume/tools/TestTimestampRoundDownUtil.java
 (line 35)


Isn't this just CET? This is the Hungarian time zone, right?



flume-ng-core/src/test/java/org/apache/flume/tools/TestTimestampRoundDownUtil.java
 (line 54)


test comment



flume-ng-core/src/test/java/org/apache/flume/tools/TestTimestampRoundDownUtil.java
 (line 56)


I see lines 56-64 duplicated in L95-L103 (and actually maybe even more code 
than that)

Can you extract a method or two to reduce the line count of this patch?



flume-ng-core/src/test/java/org/apache/flume/tools/TestTimestampRoundDownUtil.java
 (line 93)


test comment



flume-ng-core/src/test/java/org/apache/flume/tools/TestTimestampRoundDownUtil.java
 (line 111)


Doesn't this assertion fail if you run it while your clock is set to CET?



flume-ng-core/src/test/java/org/apache/flume/tools/TestTimestampRoundDownUtil.java
 (line 132)


test comment


- Mike Percy


On July 12, 2016, 1:36 a.m., Denes Arvay wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49453/
> ---
> 
> (Updated July 12, 2016, 1:36 a.m.)
> 
> 
> Review request for Flume, Balázs Donát Bessenyei and Attila Simon.
> 
> 
> Bugs: FLUME-2725
> https://issues.apache.org/jira/browse/FLUME-2725
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> Patch for FLUME-2725 - HDFS Sink does not use configured timezone for rounding
> 
> 
> Diffs
> -
> 
>   flume-ng-core/src/main/java/org/apache/flume/formatter/output/BucketPath.java b2fe3f0 
>   flume-ng-core/src/main/java/org/apache/flume/tools/TimestampRoundDownUtil.java daa9606 
>   flume-ng-core/src/test/java/org/apache/flume/formatter/output/TestBucketPath.java b1b828a 
>   flume-ng-core/src/test/java/org/apache/flume/tools/TestTimestampRoundDownUtil.java 1ac11ab 
> 
> Diff: https://reviews.apache.org/r/49453/diff/
> 
> 
> Testing
> ---
> 
> `org.apache.flume.formatter.output.TestBucketPath` and 
> `org.apache.flume.tools.TestTimestampRoundDownUtil` were extended with new 
> methods testing with `TimeZone`. Existing and new tests pass.
> 
> 
> Thanks,
> 
> Denes Arvay
> 
>



Re: Jenkins build is back to normal : Flume-trunk-hbase-1 #178

2016-07-12 Thread Hari Shreedharan
w00t!

On Tue, Jul 12, 2016 at 3:43 AM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> See 
>
>


Re: Review Request 49458: FLUME-2921 Support Elasticsearch 2.0+

2016-07-12 Thread Lior Zeno


> On July 11, 2016, 9:19 p.m., Mike Percy wrote:
> > flume-ng-configuration/src/main/java/org/apache/flume/Context.java, line 85
> > 
> >
> > This is an API and ABI breaking change.
> > 
> > We would need to release Flume 2.0 to do this.
> 
> Lior Zeno wrote:
> Guava is shaded, it's problematic to expose a shaded jar in the API.

I must add that I think it's a bad practice to expose a foreign namespace in 
our API. Either org.apache.flume.X or native java namespaces are acceptable. 
Exposing a foreign namespace results in exactly this problem. We are now going 
to break the API because of this.


- Lior


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49458/#review141781
---


On July 9, 2016, 10:13 a.m., Lior Zeno wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49458/
> ---
> 
> (Updated July 9, 2016, 10:13 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2921
> https://issues.apache.org/jira/browse/FLUME-2921
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> This patch adds the support for Elasticsearch version 2.0+. The version I 
> used is 2.3.3, which is the latest stable release.
> This patch does not fix any known issues with this sink, its only purpose is 
> to support current versions of elasticsearch.
> 
> Elasticsearch 2.3.3 depends on guava 18.0, which collided with our version. I 
> had to create a new module, flume-ng-elasticsearch-shaded, and shade guava. 
> This worked this time, but due to guava's popularity I think we should remove 
> this dependency in the future. This should be easier, now that Flume uses 
> Java 1.7.
> 
> 
> Diffs
> -
> 
>   flume-ng-configuration/src/main/java/org/apache/flume/Context.java f00b571 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-dist/src/main/assembly/bin.xml a61180d 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst f9ca1b2 
>   flume-ng-embedded-agent/src/test/java/org/apache/flume/agent/embedded/TestEmbeddedAgentEmbeddedSource.java c122a12 
>   flume-ng-node/src/main/java/org/apache/flume/node/MaterializedConfiguration.java a80bfdf 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/pom.xml c372c0b 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java 83c3ffd 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchLogStashEventSerializer.java 3638368 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchTransportClient.java 2cf365e 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchSinkTest.java 9fbd747 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchDynamicSerializer.java d4e4654 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchIndexRequestBuilderFactory.java b62254e 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchLogStashEventSerializer.java 65b4dab 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSink.java 69acc06 
>   pom.xml b50693e 
> 
> Diff: https://reviews.apache.org/r/49458/diff/
> 
> 
> Testing
> ---
> 
> I made sure that all unit tests (due to guava upgrade) pass successfully. The 
> known flaky tests may not pass, though.
> In addition, I tested the sink against a local elasticsearch instance.
> 
> 
> Thanks,
> 
> Lior Zeno
> 
>



Re: raw data in log messages is a security risk

2016-07-12 Thread Attila Simon
Hi Mike,

I created a jira (https://issues.apache.org/jira/browse/FLUME-2954)
and would like to look around first in the codebase where such content
log was introduced. Based on the actual use cases we can discuss
further what would be the best approach. Thanks for the log4j concerns
that actually moved my standpoint a bit.

Cheers,
Attila


Attila Simon
Software Engineer
Email:   s...@cloudera.com




On Sat, Jul 9, 2016 at 3:06 AM, Mike Percy  wrote:
> Hi Attila,
> Thanks for bringing this up. I agree that we should prevent logging data
> unless it is explicitly enabled.
>
> One concern I have about the log4j approach is that many people have
> customized their log4j.properties file. As long as your proposal would keep
> logging of data disabled for all of the likely customizations people might
> have in place then it sounds good. However maybe you can be a little more
> specific about how it would look in the log4j.properties file and how it
> would look at the code level when writing to that named logger. I'm not
> totally sure I understand your exact proposal.
>
> Mike
>
> On Tue, Jul 5, 2016 at 9:44 AM, Attila Simon  wrote:
>
>> Hi,
>>
>> Flume has built-in functionality to log the data flowing through it,
>> mainly for debugging purposes. This functionality appears in several
>> places in the codebase. I think it raises security
>> concerns in production environments where sensitive information might
>> be ingested, so it is crucial that enabling it is
>> as explicit as possible (avoiding setup via implicit side effects).
>> E.g.: setting the level of the root logger to debug/trace causes every
>> other logger to start logging at debug/trace, including the ones
>> logging raw data.
>>
>> Options to solve this issue:
>> 1) command line option to enable data logging
>> 2) configuration property to enable data logging globally
>> 3) implementing a single concept which is solely responsible for
>> logging ie a single LoggerSink (which already exists) or Interceptor
>> 4) introduction of a new named logger instance which is configured OFF
>> in log4j config
>> 5) any other idea is welcomed
>>
>> Considering the pros and cons of the usage and implementation I would
>> vote for 4) but I require your opinion. I'm going to open a jira to
>> tackle this work (please let me know if there are some important
>> fields I have to set considering 1.7 release).
>>
>> Cheers,
>> Attila
>>


[jira] [Created] (FLUME-2954) make raw data appearing in log messages explicit

2016-07-12 Thread Attila Simon (JIRA)
Attila Simon created FLUME-2954:
---

 Summary: make raw data appearing in log messages explicit
 Key: FLUME-2954
 URL: https://issues.apache.org/jira/browse/FLUME-2954
 Project: Flume
  Issue Type: Improvement
  Components: Channel, Configuration, Sinks+Sources
Affects Versions: v1.6.0
Reporter: Attila Simon
Assignee: Attila Simon
Priority: Critical


Flume has built-in functionality to log the data flowing through it,
mainly for debugging purposes. This functionality appears in several
places in the codebase. I think it raises security
concerns in production environments where sensitive information might
be ingested, so it is crucial that enabling it is
as explicit as possible (avoiding setup via implicit side effects).
E.g.: setting the level of the root logger to debug/trace causes every
other logger to start logging at debug/trace, including the ones
logging raw data.

In this jira I would like to provide a patch capturing how I imagined solving 
this issue. It can be refined iteratively or used as a basis for a broader 
discussion.
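
As a rough illustration of the "named logger that stays OFF" idea discussed 
on the mailing list, here is a self-contained sketch. It uses 
java.util.logging purely as a stand-in so that it runs without dependencies; 
Flume itself is configured through log4j, where the comparable setting would 
be a line of the form log4j.logger.<name>=OFF in log4j.properties. The logger 
name and the class are hypothetical and not part of any patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: raw event data goes through one dedicated logger
// that is explicitly OFF, so raising the root level does not expose it.
class RawDataLoggerSketch {
  // Assumed name for illustration; not an existing Flume logger.
  static final String RAW_DATA_LOGGER = "org.example.flume.rawdata";

  // Kept in a static field so the explicitly set level is not lost.
  static final Logger RAW = Logger.getLogger(RAW_DATA_LOGGER);

  static {
    RAW.setLevel(Level.OFF);
  }

  static boolean rawDataEnabled() {
    return RAW.isLoggable(Level.FINE);
  }

  // Returns the event body only when raw data logging was enabled
  // explicitly; otherwise returns a size-only placeholder.
  static String describe(byte[] body) {
    if (rawDataEnabled()) {
      return new String(body, StandardCharsets.UTF_8);
    }
    return "[" + body.length + " bytes]";
  }

  public static void main(String[] args) {
    byte[] body = "secret".getBytes(StandardCharsets.UTF_8);
    // Turning the root logger up does not switch the raw data logger on.
    Logger.getLogger("").setLevel(Level.FINEST);
    System.out.println(describe(body));
    // Only an explicit change to the dedicated logger does.
    RAW.setLevel(Level.FINE);
    System.out.println(describe(body));
  }
}
```

Whether an explicit OFF survives the customized log4j.properties files that 
people already have in place is exactly the open question raised in the 
thread, so this only shows the mechanism, not a migration plan.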





[jira] [Commented] (FLUME-2953) Make TaildirSource work with recursive directory

2016-07-12 Thread tinawenqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372948#comment-15372948
 ] 

tinawenqiao commented on FLUME-2953:


Hi, Attila. The patch works on branch flume-1.7, not on trunk branch.

> Make TaildirSource work with recursive directory
> 
>
> Key: FLUME-2953
> URL: https://issues.apache.org/jira/browse/FLUME-2953
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: tinawenqiao
>  Labels: Recuresive, TaildirSource, Wildcards
> Fix For: v1.7.0
>
> Attachments: FLUME-2953.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In TaildirSource filegroupName, regular expression can be used for filename 
> only. Sample usage: a1.sources.r1.filegroups.f2 = /var/log/test2/.*log.*
> If there are many files to be tracked in the same directory, the 
> configuration is oft-repeated. So it's necessary that wildcards are supported 
> in the directory path. Then the user can configure the filegroupName like 
> this:
>  a1.sources.r1.filegroups.f2 = /var/log/*/.*log.*





[jira] [Commented] (FLUME-2953) Make TaildirSource work with recursive directory

2016-07-12 Thread Attila Simon (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372943#comment-15372943
 ] 

Attila Simon commented on FLUME-2953:
-

Thanks for the patch [~wenqiao],
I tried to apply it on trunk but it failed. Could you please rebase it onto the 
latest version?

{noformat}
patch -p1 < FLUME-2953.patch
patching file flume-ng-doc/sphinx/FlumeUserGuide.rst
Hunk #1 succeeded at 1120 (offset 4 lines).
Hunk #2 succeeded at 1148 (offset 8 lines).
patching file flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java
Hunk #1 FAILED at 83.
Hunk #2 FAILED at 276.
2 out of 2 hunks FAILED -- saving rejects to file flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java.rej
patching file flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirSource.java
Hunk #1 succeeded at 75 (offset 7 lines).
Hunk #2 succeeded at 143 (offset 7 lines).
{noformat}

> Make TaildirSource work with recursive directory
> 
>
> Key: FLUME-2953
> URL: https://issues.apache.org/jira/browse/FLUME-2953
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: tinawenqiao
>  Labels: Recuresive, TaildirSource, Wildcards
> Fix For: v1.7.0
>
> Attachments: FLUME-2953.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In TaildirSource filegroupName, regular expression can be used for filename 
> only. Sample usage is : a1.sources.r1.filegroups.f2 = /var/log/test2/.\*log.\*
> If there are many files to be tracked in the same directory, the 
> configuration is oft-repeated. So it's necessary that wildcards are supported 
> in the directory path. Then the user can configure the filegroupName like 
> this:
>  a1.sources.r1.filegroups.f2 = /var/log/\*/.\*log.\*





[jira] [Comment Edited] (FLUME-2953) Make TaildirSource work with recursive directory

2016-07-12 Thread Attila Simon (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372943#comment-15372943
 ] 

Attila Simon edited comment on FLUME-2953 at 7/12/16 2:17 PM:
--

Thanks for the patch [~wenqiao],
I tried to apply it on trunk but failed. Could you please rebase it to the 
latest version?

{noformat}
patch -p1 < FLUME-2953.patch
patching file flume-ng-doc/sphinx/FlumeUserGuide.rst
Hunk #1 succeeded at 1120 (offset 4 lines).
Hunk #2 succeeded at 1148 (offset 8 lines).
patching file 
flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java
Hunk #1 FAILED at 83.
Hunk #2 FAILED at 276.
2 out of 2 hunks FAILED -- saving rejects to file 
flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java.rej
patching file 
flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirSource.java
Hunk #1 succeeded at 75 (offset 7 lines).
Hunk #2 succeeded at 143 (offset 7 lines).
{noformat}


was (Author: sati):
Thanks for the patch [~wenqiao],
I tried to applied it on trunk but it failed. Could you please rebase it to the 
latest version?

{noformat}
patch -p1 < FLUME-2953.patch
patching file flume-ng-doc/sphinx/FlumeUserGuide.rst
Hunk #1 succeeded at 1120 (offset 4 lines).
Hunk #2 succeeded at 1148 (offset 8 lines).
patching file 
flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java
Hunk #1 FAILED at 83.
Hunk #2 FAILED at 276.
2 out of 2 hunks FAILED -- saving rejects to file 
flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java.rej
patching file 
flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirSource.java
Hunk #1 succeeded at 75 (offset 7 lines).
Hunk #2 succeeded at 143 (offset 7 lines).
{noformat}

> Make TaildirSource work with recursive directory
> 
>
> Key: FLUME-2953
> URL: https://issues.apache.org/jira/browse/FLUME-2953
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: tinawenqiao
>  Labels: Recuresive, TaildirSource, Wildcards
> Fix For: v1.7.0
>
> Attachments: FLUME-2953.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In TaildirSource filegroupName, regular expression can be used for filename 
> only. Sample usage is : a1.sources.r1.filegroups.f2 = /var/log/test2/.\*log.\*
> If there are many files to be tracked in the same directory, the 
> configuration is oft-repeated. So it's necessary that wildcards are supported 
> in the directory path. Then the user can configure the filegroupName like 
> this:
>  a1.sources.r1.filegroups.f2 = /var/log/\*/.\*log.\*





[jira] [Updated] (FLUME-2953) Make TaildirSource work with recursive directory

2016-07-12 Thread tinawenqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tinawenqiao updated FLUME-2953:
---
Attachment: FLUME-2953.patch

> Make TaildirSource work with recursive directory
> 
>
> Key: FLUME-2953
> URL: https://issues.apache.org/jira/browse/FLUME-2953
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: tinawenqiao
>  Labels: Recuresive, TaildirSource, Wildcards
> Fix For: v1.7.0
>
> Attachments: FLUME-2953.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In TaildirSource filegroupName, regular expression can be used for filename 
> only. Sample usage is : a1.sources.r1.filegroups.f2 = /var/log/test2/.\*log.\*
> If there are many files to be tracked in the same directory, the 
> configuration is oft-repeated. So it's necessary that wildcards are supported 
> in the directory path. Then the user can configure the filegroupName like 
> this:
>  a1.sources.r1.filegroups.f2 = /var/log/\*/.\*log.\*





Re: Review Request 49458: FLUME-2921 Support Elasticsearch 2.0+

2016-07-12 Thread Lior Zeno


> On July 11, 2016, 9:19 p.m., Mike Percy wrote:
> > flume-ng-configuration/src/main/java/org/apache/flume/Context.java, line 85
> > 
> >
> > This is an API and ABI breaking change.
> > 
> > We would need to release Flume 2.0 to do this.

Guava is shaded, and it's problematic to expose a shaded jar in the API.


- Lior


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49458/#review141781
---


On July 9, 2016, 10:13 a.m., Lior Zeno wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49458/
> ---
> 
> (Updated July 9, 2016, 10:13 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2921
> https://issues.apache.org/jira/browse/FLUME-2921
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> This patch adds the support for Elasticsearch version 2.0+. The version I 
> used is 2.3.3, which is the latest stable release.
> This patch does not fix any known issues with this sink, its only purpose is 
> to support current versions of elasticsearch.
> 
> Elasticsearch 2.3.3 depends on guava 18.0, which collided with our version. I 
> had to create a new module, flume-ng-elasticsearch-shaded, and shade guava. 
> This worked this time, but due to guava's popularity I think we should remove 
> this dependency in the future. This should be easier, now that Flume uses 
> Java 1.7.
> 
> 
> Diffs
> -
> 
>   flume-ng-configuration/src/main/java/org/apache/flume/Context.java f00b571 
>   
> flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java
>  b9f2438 
>   flume-ng-dist/src/main/assembly/bin.xml a61180d 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst f9ca1b2 
>   
> flume-ng-embedded-agent/src/test/java/org/apache/flume/agent/embedded/TestEmbeddedAgentEmbeddedSource.java
>  c122a12 
>   
> flume-ng-node/src/main/java/org/apache/flume/node/MaterializedConfiguration.java
>  a80bfdf 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/pom.xml c372c0b 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
>  83c3ffd 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchLogStashEventSerializer.java
>  3638368 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchTransportClient.java
>  2cf365e 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchSinkTest.java
>  9fbd747 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchDynamicSerializer.java
>  d4e4654 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchIndexRequestBuilderFactory.java
>  b62254e 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchLogStashEventSerializer.java
>  65b4dab 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSink.java
>  69acc06 
>   pom.xml b50693e 
> 
> Diff: https://reviews.apache.org/r/49458/diff/
> 
> 
> Testing
> ---
> 
> I made sure that all unit tests (due to guava upgrade) pass successfully. The 
> known flaky tests may not pass, though.
> In addition, I tested the sink against a local elasticsearch instance.
> 
> 
> Thanks,
> 
> Lior Zeno
> 
>



Re: Review Request 49458: FLUME-2921 Support Elasticsearch 2.0+

2016-07-12 Thread Lior Zeno


> On July 11, 2016, 9:16 p.m., Mike Percy wrote:
> > flume-ng-dist/src/main/assembly/bin.xml, line 41
> > 
> >
> > How sure are you that this is safe? You are sure that there are no 
> > sources or sinks that use Guava without an external optional dependency 
> > that provides it?

Not sure. I think I can omit that since the user has to override other deps, 
e.g. Jackson.


- Lior


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49458/#review141780
---


On July 9, 2016, 10:13 a.m., Lior Zeno wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49458/
> ---
> 
> (Updated July 9, 2016, 10:13 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2921
> https://issues.apache.org/jira/browse/FLUME-2921
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> This patch adds the support for Elasticsearch version 2.0+. The version I 
> used is 2.3.3, which is the latest stable release.
> This patch does not fix any known issues with this sink, its only purpose is 
> to support current versions of elasticsearch.
> 
> Elasticsearch 2.3.3 depends on guava 18.0, which collided with our version. I 
> had to create a new module, flume-ng-elasticsearch-shaded, and shade guava. 
> This worked this time, but due to guava's popularity I think we should remove 
> this dependency in the future. This should be easier, now that Flume uses 
> Java 1.7.
> 
> 
> Diffs
> -
> 
>   flume-ng-configuration/src/main/java/org/apache/flume/Context.java f00b571 
>   
> flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java
>  b9f2438 
>   flume-ng-dist/src/main/assembly/bin.xml a61180d 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst f9ca1b2 
>   
> flume-ng-embedded-agent/src/test/java/org/apache/flume/agent/embedded/TestEmbeddedAgentEmbeddedSource.java
>  c122a12 
>   
> flume-ng-node/src/main/java/org/apache/flume/node/MaterializedConfiguration.java
>  a80bfdf 
>   flume-ng-sinks/flume-ng-elasticsearch-sink/pom.xml c372c0b 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
>  83c3ffd 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchLogStashEventSerializer.java
>  3638368 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/client/ElasticSearchTransportClient.java
>  2cf365e 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/AbstractElasticSearchSinkTest.java
>  9fbd747 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchDynamicSerializer.java
>  d4e4654 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchIndexRequestBuilderFactory.java
>  b62254e 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchLogStashEventSerializer.java
>  65b4dab 
>   
> flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSink.java
>  69acc06 
>   pom.xml b50693e 
> 
> Diff: https://reviews.apache.org/r/49458/diff/
> 
> 
> Testing
> ---
> 
> I made sure that all unit tests (due to guava upgrade) pass successfully. The 
> known flaky tests may not pass, though.
> In addition, I tested the sink against a local elasticsearch instance.
> 
> 
> Thanks,
> 
> Lior Zeno
> 
>



Jenkins build is back to normal : Flume-trunk-hbase-1 #178

2016-07-12 Thread Apache Jenkins Server
See 



Re: Build failed in Jenkins: Flume-trunk-hbase-1 #177

2016-07-12 Thread Lior Zeno
The integration tests fail because there were earlier errors in the build,
so Maven did not generate the tarball.

On Tue, Jul 12, 2016 at 12:01 PM, Attila Simon  wrote:

> Hi,
>
> "TestFileChannel.testInOut:117 ?  Failed to locate tar-ball
> distribution. Pleas..."
> These seem transient and build-related: OOM killer, disk swipe, some
> restart, etc. Would you mind retriggering the job? I don't have
> permissions for that.
>
>
> Regression
>
> org.apache.flume.test.agent.TestSpooldirSource.testManySpooldirs
>
> Failing for the past 1 build (Since #176 )
> Took 15 ms.
>
> Error Message
>
> Failed to locate tar-ball distribution. Please specify explicitly via
> system property: flume.dist.tarball
>
> Stacktrace
>
> java.lang.Exception: Failed to locate tar-ball distribution. Please
> specify explicitly via system property: flume.dist.tarball
> at org.apache.flume.test.util.StagedInstall.<init>(StagedInstall.java:219)
> at org.apache.flume.test.util.StagedInstall.getInstance(StagedInstall.java:76)
> at org.apache.flume.test.agent.TestSpooldirSource.setup(TestSpooldirSource.java:59)
>
> Standard Output
>
> 2016-07-11 00:07:44,715 (main) [INFO -
> org.apache.flume.test.util.StagedInstall.<init>(StagedInstall.java:211)]
> No value specified for system property: flume.dist.tarball. Will
> attempt to use relative path to locate dist tarball.
> 2016-07-11 00:07:44,719 (main) [INFO -
> org.apache.flume.test.util.StagedInstall.<init>(StagedInstall.java:211)]
> No value specified for system property: flume.dist.tarball. Will
> attempt to use relative path to locate dist tarball.
>
> @Mike: StagedInstall was last edited by you. Do you happen to remember
> whether there is any documentation explaining how it is supposed to
> work in detail, and what its weak points are, if any? (I got a
> high-level picture just by looking at the code, but it would be good to
> know the original intent and the reasoning for why it was implemented in
> exactly this way.)
>
> @Mike: decomposing what you wrote: ~5.6G is the max memory, and Flume
> would like to allocate its first ~1M block of direct memory. I guess
> Jenkins was configured with a non-Sun/Oracle JVM, but that shouldn't be
> an issue. If the OOM killer did this, then it might be related, but it is
> unlikely that 1M triggered it (unless the allowed max direct memory was
> set to something smaller than 1M). Otherwise it seems normal as far as
> the file channel is concerned.
>
> The corresponding code is this:
> private static long getDefaultDirectMemorySize() {
>   try {
>     Class VM = Class.forName("sun.misc.VM");
>     Method maxDirectMemory = VM.getDeclaredMethod("maxDirectMemory",
>         (Class)null);
>     Object result = maxDirectMemory.invoke(null, (Object[])null);
>     if (result != null && result instanceof Long) {
>       return (Long)result;
>     }
>   } catch (Exception e) {
>     LOG.info("Unable to get maxDirectMemory from VM: " +
>         e.getClass().getSimpleName() + ": " + e.getMessage());
>   }
>   // default according to VM.maxDirectMemory()
>   return Runtime.getRuntime().maxMemory();
> }
>
> Attila Simon
> Software Engineer
> Email:   s...@cloudera.com
>
>
>
>
> On Tue, Jul 12, 2016 at 5:38 AM, Lior Zeno  wrote:
> > It's weird, not sure why the test was killed. Maybe the output file has a
> > few more hints.
> >
> > Could you please upload the output file? I can't access it since it
> > requires a Jenkins user.
> > On Jul 12, 2016 00:51, "Mike Percy"  wrote:
> >
> > TestFileChannel was killed for some reason on this run.
> >
> > From https://builds.apache.org/job/Flume-trunk-hbase-1/177/consoleFull :
> >
> > ---
> >  T E S T S
> > ---
> > Running org.apache.flume.channel.file.TestLogFile
> > Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.997 sec
> > Running org.apache.flume.channel.file.TestFileChannel
> > Killed
> >
> >
> > I wonder if the OOM killer got it or something?
> >
> > Not a lot to go on from the test log:
> > https://builds.apache.org/job/Flume-trunk-hbase-1/ws/flume-ng-channels/flume-file-channel/target/surefire-reports/org.apache.flume.channel.file.TestLogFile-output.txt
> >
> > Except this at the top:
> >
> > 2016-07-11 20:51:58,683 (main) [INFO -
> > org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize(DirectMemoryUtils.java:112)]
> > Unable to get maxDirectMemory from VM: NoSuchMethodException:
> > sun.misc.VM.maxDirectMemory(null)
> > 2016-07-11 20:51:58,698 (main) [INFO -
> > org.apache.flume.tools.DirectMemoryUtils.allocate(DirectMemoryUtils.java:48)]
> > Direct Memory Allocation:  Allocation = 1048576, Allocated = 0,
> > MaxDirectMemorySize = 5616697344, Remaining = 5616697344
> >
> > I wonder if that could have anything to do with this. I don't have much
> > time to investigate right now, though.
> >
> > Mike
> >
> > On Mon, Jul 11, 2016 at 2:24 PM, Apache Jenkins Server <

Re: Review Request 49453: Patch for FLUME-2725

2016-07-12 Thread Attila Simon

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49453/#review141876
---


Ship it!




Ship It!

- Attila Simon


On July 12, 2016, 8:36 a.m., Denes Arvay wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49453/
> ---
> 
> (Updated July 12, 2016, 8:36 a.m.)
> 
> 
> Review request for Flume, Balázs Donát Bessenyei and Attila Simon.
> 
> 
> Bugs: FLUME-2725
> https://issues.apache.org/jira/browse/FLUME-2725
> 
> 
> Repository: flume-git
> 
> 
> Description
> ---
> 
> Patch for FLUME-2725 - HDFS Sink does not use configured timezone for rounding
> 
> 
> Diffs
> -
> 
>   
> flume-ng-core/src/main/java/org/apache/flume/formatter/output/BucketPath.java 
> b2fe3f0 
>   
> flume-ng-core/src/main/java/org/apache/flume/tools/TimestampRoundDownUtil.java
>  daa9606 
>   
> flume-ng-core/src/test/java/org/apache/flume/formatter/output/TestBucketPath.java
>  b1b828a 
>   
> flume-ng-core/src/test/java/org/apache/flume/tools/TestTimestampRoundDownUtil.java
>  1ac11ab 
> 
> Diff: https://reviews.apache.org/r/49453/diff/
> 
> 
> Testing
> ---
> 
> `org.apache.flume.formatter.output.TestBucketPath` and 
> `org.apache.flume.tools.TestTimestampRoundDownUtil` were extended with new 
> methods testing with `TimeZone`. Existing and new tests pass.
> 
> 
> Thanks,
> 
> Denes Arvay
> 
>



[jira] [Commented] (FLUME-2942) AvroEventDeserializer ignores header from spool source

2016-07-12 Thread Sebastian Alfers (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372573#comment-15372573
 ] 

Sebastian Alfers commented on FLUME-2942:
-

Hi [~mpercy], thanks for your reply.

This is our config:

# AGENT SETTINGS
agent1.channels = ch1
agent1.sources = thriftSrc spool
agent1.sinks = kafka fileroll
agent1.sinkgroups = g1

# MEMORY CHANNEL
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 1
agent1.channels.ch1.transactionCapacity = 500

# THRIFT (source)
agent1.sources.thriftSrc.type = thrift
agent1.sources.thriftSrc.channels = ch1
agent1.sources.thriftSrc.bind = 0.0.0.0
agent1.sources.thriftSrc.port = 4042

# SPOOLDIR (source)
agent1.sources.spool.type = spooldir
agent1.sources.spool.channels = ch1
agent1.sources.spool.spoolDir = /opt/flume-ng/failover/spool
agent1.sources.spool.fileHeader = true

agent1.sources.spool.deserializer = AVRO

agent1.sources.thriftSrc.threads = 150

agent1.sinks.kafka.channel = ch1 
agent1.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
agent1.sinks.kafka.batchSize = 50
agent1.sinks.kafka.brokerList = plista590.plista.com:9092,plista591.plista.com:9092
#agent1.sinks.kafka.topic = HPTStream.raw


# FILE ROLL (failover sink)
agent1.sinks.fileroll.type = file_roll
agent1.sinks.fileroll.channel = ch1
agent1.sinks.fileroll.sink.directory = /opt/flume-ng/failover/data
agent1.sinks.fileroll.sink.serializer = avro_event

# FAILOVER GROUP
agent1.sinkgroups.g1.sinks = kafka fileroll
agent1.sinkgroups.g1.processor.type = failover
agent1.sinkgroups.g1.processor.priority.kafka = 10
agent1.sinkgroups.g1.processor.priority.fileroll = 5
agent1.sinkgroups.g1.processor.maxpenalty = 1

Please look at the agent1.sources.spool.deserializer setting in the config 
above. In our deployment, we set it to the FQCN of our own deserializer class 
to apply the fix.

> AvroEventDeserializer ignores header from spool source
> --
>
> Key: FLUME-2942
> URL: https://issues.apache.org/jira/browse/FLUME-2942
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Sebastian Alfers
>
> I have a spool file source and use Avro for de-/serialization.
> In detail, serialized events store the topic of the Kafka sink in the header.
> When I load the events from the spool directory, the headers are ignored. 
> Please see: 
> https://github.com/apache/flume/blob/caa64a1a6d4bc97be5993cb468516e9ffe862794/flume-ng-core/src/main/java/org/apache/flume/serialization/AvroEventDeserializer.java#L122
> As you can see, it uses the whole event as the body and does not distinguish 
> between the header and body encoded by Avro.
> Please verify that this is a bug.
> I fixed this bug by using the record that stores header and body separately.





Re: Build failed in Jenkins: Flume-trunk-hbase-1 #177

2016-07-12 Thread Attila Simon
Hi,

"TestFileChannel.testInOut:117 ?  Failed to locate tar-ball
distribution. Pleas..."
These seem transient and build-related: OOM killer, disk swipe, some
restart, etc. Would you mind retriggering the job? I don't have
permissions for that.


Regression

org.apache.flume.test.agent.TestSpooldirSource.testManySpooldirs

Failing for the past 1 build (Since #176 )
Took 15 ms.

Error Message

Failed to locate tar-ball distribution. Please specify explicitly via
system property: flume.dist.tarball

Stacktrace

java.lang.Exception: Failed to locate tar-ball distribution. Please
specify explicitly via system property: flume.dist.tarball
at org.apache.flume.test.util.StagedInstall.<init>(StagedInstall.java:219)
at org.apache.flume.test.util.StagedInstall.getInstance(StagedInstall.java:76)
at org.apache.flume.test.agent.TestSpooldirSource.setup(TestSpooldirSource.java:59)

Standard Output

2016-07-11 00:07:44,715 (main) [INFO -
org.apache.flume.test.util.StagedInstall.<init>(StagedInstall.java:211)]
No value specified for system property: flume.dist.tarball. Will
attempt to use relative path to locate dist tarball.
2016-07-11 00:07:44,719 (main) [INFO -
org.apache.flume.test.util.StagedInstall.<init>(StagedInstall.java:211)]
No value specified for system property: flume.dist.tarball. Will
attempt to use relative path to locate dist tarball.
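
For illustration, the lookup order that the messages above suggest (explicit
system property first, then a relative-path fallback) could be sketched roughly
as follows. This is a minimal sketch and not the actual StagedInstall code: the
class name, the locate helper and the "*.tar.gz in a fallback directory" rule
are all assumptions made for the example. Only the property name
flume.dist.tarball comes from the log.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.stream.Stream;

public class TarballLocatorSketch {
  // Property name taken from the error message in the log.
  static final String TARBALL_PROPERTY = "flume.dist.tarball";

  /**
   * Hypothetical helper: prefer the explicit system property, otherwise
   * look for a *.tar.gz file directly inside fallbackDir.
   */
  static Optional<Path> locate(Path fallbackDir) throws IOException {
    String explicit = System.getProperty(TARBALL_PROPERTY);
    if (explicit != null && !explicit.isEmpty()) {
      Path candidate = Paths.get(explicit);
      return Files.isRegularFile(candidate) ? Optional.of(candidate) : Optional.empty();
    }
    if (!Files.isDirectory(fallbackDir)) {
      return Optional.empty();
    }
    // Assumed fallback rule: first *.tar.gz entry in name order.
    try (Stream<Path> entries = Files.list(fallbackDir)) {
      return entries
          .filter(p -> p.getFileName().toString().endsWith(".tar.gz"))
          .sorted()
          .findFirst();
    }
  }
}
```

With a lookup like this, an empty result is what one would expect whenever an
earlier build step failed and no tarball was produced, regardless of the test
being run.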

@Mike: StagedInstall was last edited by you. Do you happen to remember
whether there is any documentation explaining how it is supposed to
work in detail, and what its weak points are, if any? (I got a
high-level picture just by looking at the code, but it would be good to
know the original intent and the reasoning for why it was implemented in
exactly this way.)

@Mike: decomposing what you wrote: ~5.6G is the max memory, and Flume
would like to allocate its first ~1M block of direct memory. I guess
Jenkins was configured with a non-Sun/Oracle JVM, but that shouldn't be
an issue. If the OOM killer did this, then it might be related, but it is
unlikely that 1M triggered it (unless the allowed max direct memory was
set to something smaller than 1M). Otherwise it seems normal as far as
the file channel is concerned.

The corresponding code is this:
private static long getDefaultDirectMemorySize() {
  try {
    Class VM = Class.forName("sun.misc.VM");
    Method maxDirectMemory = VM.getDeclaredMethod("maxDirectMemory",
        (Class)null);
    Object result = maxDirectMemory.invoke(null, (Object[])null);
    if (result != null && result instanceof Long) {
      return (Long)result;
    }
  } catch (Exception e) {
    LOG.info("Unable to get maxDirectMemory from VM: " +
        e.getClass().getSimpleName() + ": " + e.getMessage());
  }
  // default according to VM.maxDirectMemory()
  return Runtime.getRuntime().maxMemory();
}
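
A side note on the NoSuchMethodException reported in this thread: another
possible reading, offered as an assumption rather than a confirmed diagnosis,
is that passing a single null Class to getDeclaredMethod asks for a
one-parameter method instead of a no-arg one. The sketch below shows that
effect on java.lang.Runtime.getRuntime, which is only a stand-in; the class and
helper names are hypothetical.

```java
import java.lang.reflect.Method;

public class ReflectionLookupSketch {
  /**
   * Looks up a declared method either with no parameter types or with a
   * single null parameter type, and reports what happened.
   */
  static String lookup(Class<?> owner, String name, boolean passNullType) {
    try {
      Method m = passNullType
          // Varargs wrap the null into a one-element Class[] array.
          ? owner.getDeclaredMethod(name, (Class<?>) null)
          // No parameter types: matches a no-arg method.
          : owner.getDeclaredMethod(name);
      return "found " + m.getName();
    } catch (NoSuchMethodException e) {
      return e.getClass().getSimpleName() + ": " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(lookup(Runtime.class, "getRuntime", false));
    System.out.println(lookup(Runtime.class, "getRuntime", true));
  }
}
```

If that reading applies here, the fallback to Runtime.getRuntime().maxMemory()
would be taken on any JVM, which would fit the MaxDirectMemorySize value in the
log but would not by itself explain the killed test.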

Attila Simon
Software Engineer
Email:   s...@cloudera.com




On Tue, Jul 12, 2016 at 5:38 AM, Lior Zeno  wrote:
> It's weird, not sure why the test was killed. Maybe the output file has a
> few more hints.
>
> Could you please upload the output file? I can't access it since it
> requires a Jenkins user.
> On Jul 12, 2016 00:51, "Mike Percy"  wrote:
>
> TestFileChannel was killed for some reason on this run.
>
> From https://builds.apache.org/job/Flume-trunk-hbase-1/177/consoleFull :
>
> ---
>  T E S T S
> ---
> Running org.apache.flume.channel.file.TestLogFile
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.997 sec
> Running org.apache.flume.channel.file.TestFileChannel
> Killed
>
>
> I wonder if the OOM killer got it or something?
>
> Not a lot to go on from the test log:
> https://builds.apache.org/job/Flume-trunk-hbase-1/ws/flume-ng-channels/flume-file-channel/target/surefire-reports/org.apache.flume.channel.file.TestLogFile-output.txt
>
> Except this at the top:
>
> 2016-07-11 20:51:58,683 (main) [INFO -
> org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize(DirectMemoryUtils.java:112)]
> Unable to get maxDirectMemory from VM: NoSuchMethodException:
> sun.misc.VM.maxDirectMemory(null)
> 2016-07-11 20:51:58,698 (main) [INFO -
> org.apache.flume.tools.DirectMemoryUtils.allocate(DirectMemoryUtils.java:48)]
> Direct Memory Allocation:  Allocation = 1048576, Allocated = 0,
> MaxDirectMemorySize = 5616697344, Remaining = 5616697344
>
> I wonder if that could have anything to do with this. I don't have much
> time to investigate right now, though.
>
> Mike
>
> On Mon, Jul 11, 2016 at 2:24 PM, Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
>> See 
>>
>> --
>> [...truncated 3122 lines...]
>> Tests in error:
>>   TestFileChannel.testInOut:117 ?  Failed to locate tar-ball distribution.
>> Pleas...
>>   TestFileChannel.tearDown:95 ?  Failed to locate tar-ball distribution.
>> Please ...
>>   TestSpooldirSource.setup:59 ?  Failed to locate 

[jira] [Updated] (FLUME-2953) Make TaildirSource work with recursive directory

2016-07-12 Thread tinawenqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tinawenqiao updated FLUME-2953:
---
Description: 
In TaildirSource filegroupName, regular expression can be used for filename 
only. Sample usage is : a1.sources.r1.filegroups.f2 = /var/log/test2/.\*log.\*
If there are many files to be tracked in the same directory, the configuration 
is oft-repeated. So it's necessary that wildcards are supported in the 
directory path. Then the user can configure the filegroupName like this:
 a1.sources.r1.filegroups.f2 = /var/log/*/.*log.*


  was:
In TaildirSource filegroupName, regular expression can be used for filename 
only. Sample usage is : a1.sources.r1.filegroups.f2 = /var/log/test2/.*log.*
If there are many files to be tracked in the same directory, the configuration 
is oft-repeated. So it's necessary that wildcards are supported in the 
directory path. Then the user can configure the filegroupName like this:
 a1.sources.r1.filegroups.f2 = /var/log/*/.*log.*



> Make TaildirSource work with recursive directory
> 
>
> Key: FLUME-2953
> URL: https://issues.apache.org/jira/browse/FLUME-2953
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: tinawenqiao
>  Labels: Recuresive, TaildirSource, Wildcards
> Fix For: v1.7.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In TaildirSource filegroupName, regular expression can be used for filename 
> only. Sample usage is : a1.sources.r1.filegroups.f2 = /var/log/test2/.\*log.\*
> If there are many files to be tracked in the same directory, the 
> configuration is oft-repeated. So it's necessary that wildcards are supported 
> in the directory path. Then the user can configure the filegroupName like 
> this:
>  a1.sources.r1.filegroups.f2 = /var/log/*/.*log.*





[jira] [Updated] (FLUME-2953) Make TaildirSource work with recursive directory

2016-07-12 Thread tinawenqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tinawenqiao updated FLUME-2953:
---
Description: 
In TaildirSource filegroupName, regular expression can be used for filename 
only. Sample usage is : a1.sources.r1.filegroups.f2 = /var/log/test2/.\*log.\*
If there are many files to be tracked in the same directory, the configuration 
is oft-repeated. So it's necessary that wildcards are supported in the 
directory path. Then the user can configure the filegroupName like this:
 a1.sources.r1.filegroups.f2 = /var/log/\*/.\*log.\*


  was:
In TaildirSource filegroupName, regular expression can be used for filename 
only. Sample usage is : a1.sources.r1.filegroups.f2 = /var/log/test2/.\*log.\*
If there are many files to be tracked in the same directory, the configuration 
is oft-repeated. So it's necessary that wildcards are supported in the 
directory path. Then the user can configure the filegroupName like this:
 a1.sources.r1.filegroups.f2 = /var/log/*/.*log.*



> Make TaildirSource work with recursive directory
> 
>
> Key: FLUME-2953
> URL: https://issues.apache.org/jira/browse/FLUME-2953
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: tinawenqiao
>  Labels: Recuresive, TaildirSource, Wildcards
> Fix For: v1.7.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In TaildirSource filegroupName, regular expression can be used for filename 
> only. Sample usage is : a1.sources.r1.filegroups.f2 = /var/log/test2/.\*log.\*
> If there are many files to be tracked in the same directory, the 
> configuration is oft-repeated. So it's necessary that wildcards are supported 
> in the directory path. Then the user can configure the filegroupName like 
> this:
>  a1.sources.r1.filegroups.f2 = /var/log/\*/.\*log.\*





[jira] [Created] (FLUME-2953) Make TaildirSource work with recursive directory

2016-07-12 Thread tinawenqiao (JIRA)
tinawenqiao created FLUME-2953:
--

 Summary: Make TaildirSource work with recursive directory
 Key: FLUME-2953
 URL: https://issues.apache.org/jira/browse/FLUME-2953
 Project: Flume
  Issue Type: Improvement
  Components: Sinks+Sources
Affects Versions: v1.7.0
Reporter: tinawenqiao
 Fix For: v1.7.0


In the TaildirSource filegroupName setting, a regular expression can be used 
for the filename only. Sample usage: a1.sources.r1.filegroups.f2 = /var/log/test2/.*log.*
If there are many files to be tracked in the same directory, the configuration 
becomes repetitive, so wildcards should also be supported in the 
directory path. The user could then configure the filegroupName like this:
 a1.sources.r1.filegroups.f2 = /var/log/*/.*log.*
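
To make the proposal concrete, here is a minimal sketch of how a directory
wildcard could be combined with the existing filename regex. It is not the code
from the attached patch or from ReliableTaildirEventReader: the class name, the
match helper and the choice of a glob for the directory part are assumptions
made for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class WildcardFilegroupSketch {
  /**
   * Hypothetical helper: returns regular files under root whose parent
   * directory (relative to root) matches dirGlob and whose file name
   * matches fileRegex.
   */
  static List<Path> match(Path root, String dirGlob, String fileRegex) throws IOException {
    PathMatcher dirMatcher = root.getFileSystem().getPathMatcher("glob:" + dirGlob);
    Pattern filePattern = Pattern.compile(fileRegex);
    List<Path> result = new ArrayList<>();
    try (Stream<Path> paths = Files.walk(root)) {
      paths.filter(Files::isRegularFile)
          .filter(p -> {
            Path relDir = root.relativize(p.getParent());
            // Skip files that sit directly in root (empty relative path).
            return !relDir.toString().isEmpty() && dirMatcher.matches(relDir);
          })
          .filter(p -> filePattern.matcher(p.getFileName().toString()).matches())
          .sorted()
          .forEach(result::add);
    }
    return result;
  }
}
```

Under these assumptions, match(Paths.get("/var/log"), "*", ".*log.*") would
correspond to the /var/log/*/.*log.* example: the glob * matches exactly one
directory level, so deeper subdirectories would need a pattern such as **.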






[jira] [Comment Edited] (FLUME-2725) HDFS Sink does not use configured timezone for rounding

2016-07-12 Thread Denes Arvay (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372535#comment-15372535
 ] 

Denes Arvay edited comment on FLUME-2725 at 7/12/16 8:40 AM:
-

new patch: rebased on trunk + fixed checkstyle errors


was (Author: denes):
rebased on trunk + fixed checkstyle errors

> HDFS Sink does not use configured timezone for rounding
> ---
>
> Key: FLUME-2725
> URL: https://issues.apache.org/jira/browse/FLUME-2725
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Reporter: Eric Czech
>Assignee: Denes Arvay
>Priority: Minor
> Attachments: FLUME-2725-2.patch, FLUME-2725.patch
>
>
> When a BucketPath used by an HDFS sink is configured to run with some 
> roundUnit and roundValue > 1 (e.g. 6 hours), the "roundDown" function used by 
> BucketPath does not actually round the date correctly.
> That function calls TimestampRoundDownUtil which creates a Calendar instance 
> using the *local* timezone to truncate a unix timestamp rather than the 
> TimeZone that the sink was configured to convert dates to paths with (and 
> that timezone is already available in the BucketPath class but it just isn't 
> passed to TimestampRoundDownUtil).
> The net effect of this is that if a flume jvm is running on a system with an 
> EST clock while trying to write, say, 6 hour directories in UTC time, the 
> directories are written with the hours 04, 10, 16, 22 rather than 00, 06, 12, 
> 18 like you would expect.
> I found a workaround for this by passing 
> "-Duser.timezone=" as a system property, but I wanted to 
> create a ticket for this since it seems like it would be very minimal effort 
> to carry that configured timezone down into the rounding utility as well.
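
The rounding error described in the ticket can be reproduced with a short sketch. It is written in Python for brevity and only mirrors the arithmetic; `round_down_hours` and `utc_bucket_hour` are hypothetical names, not the API of Flume's `TimestampRoundDownUtil`. The hours quoted in the report (04, 10, 16, 22) correspond to a UTC-4 offset, which is Eastern daylight time, so the sketch assumes a fixed UTC-4 zone for the local clock:

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Assumption: the "local" JVM clock is modelled as a fixed UTC-4 offset.
EASTERN_DAYLIGHT = timezone(timedelta(hours=-4))


def round_down_hours(epoch_millis, round_hours, tz):
    """Truncate a Unix timestamp to a multiple of round_hours on tz's wall clock."""
    local = datetime.fromtimestamp(epoch_millis / 1000, tz)
    floored = local.replace(
        hour=local.hour // round_hours * round_hours,
        minute=0, second=0, microsecond=0,
    )
    return int(floored.timestamp() * 1000)


def utc_bucket_hour(epoch_millis, round_hours, rounding_tz):
    """Hour that appears in a UTC-formatted path after rounding in rounding_tz."""
    rounded = round_down_hours(epoch_millis, round_hours, rounding_tz)
    return datetime.fromtimestamp(rounded / 1000, UTC).hour


base = int(datetime(2016, 7, 12, tzinfo=UTC).timestamp() * 1000)
hour_ms = 3_600_000

# Rounding on the local clock, then formatting the path in UTC (the bug).
local_hours = sorted(
    {utc_bucket_hour(base + h * hour_ms, 6, EASTERN_DAYLIGHT) for h in range(24)}
)
# Rounding in the same zone that is used for formatting (the intended result).
utc_hours = sorted(
    {utc_bucket_hour(base + h * hour_ms, 6, UTC) for h in range(24)}
)
print(local_hours)  # [4, 10, 16, 22]
print(utc_hours)    # [0, 6, 12, 18]
```

Passing the configured zone into the rounding step, as the ticket proposes, makes both steps agree and yields the 00, 06, 12, 18 buckets.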





[jira] [Updated] (FLUME-2725) HDFS Sink does not use configured timezone for rounding

2016-07-12 Thread Denes Arvay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denes Arvay updated FLUME-2725:
---
Attachment: FLUME-2725-2.patch

rebased on trunk + fixed checkstyle errors






Re: Review Request 49453: Patch for FLUME-2725

2016-07-12 Thread Denes Arvay

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49453/
---

(Updated July 12, 2016, 8:36 a.m.)


Review request for Flume, Balázs Donát Bessenyei and Attila Simon.


Changes
---

rebased on trunk + fixed checkstyle errors


Bugs: FLUME-2725
https://issues.apache.org/jira/browse/FLUME-2725


Repository: flume-git


Description
---

Patch for FLUME-2725 - HDFS Sink does not use configured timezone for rounding


Diffs (updated)
-

  flume-ng-core/src/main/java/org/apache/flume/formatter/output/BucketPath.java b2fe3f0 
  flume-ng-core/src/main/java/org/apache/flume/tools/TimestampRoundDownUtil.java daa9606 
  flume-ng-core/src/test/java/org/apache/flume/formatter/output/TestBucketPath.java b1b828a 
  flume-ng-core/src/test/java/org/apache/flume/tools/TestTimestampRoundDownUtil.java 1ac11ab 

Diff: https://reviews.apache.org/r/49453/diff/


Testing
---

`org.apache.flume.formatter.output.TestBucketPath` and 
`org.apache.flume.tools.TestTimestampRoundDownUtil` were extended with new 
methods testing with `TimeZone`. Existing and new tests pass.


Thanks,

Denes Arvay



Re: Build failed in Jenkins: Flume-trunk-hbase-1 #177

2016-07-12 Thread Mike Percy
That is very strange. I don't know why it would be that way.

Anyway, here you go:
https://gist.github.com/mpercy/dcf5672d5978eddd712f6c12c515f44c

Mike

On Mon, Jul 11, 2016 at 8:38 PM, Lior Zeno wrote:

> It's weird, not sure why the test was killed. Maybe the output file has a
> few more hints.
>
> Could you please upload the output file? I can't access it since it
> requires a Jenkins user.
> On Jul 12, 2016 00:51, "Mike Percy" wrote:
>
> TestFileChannel was killed for some reason on this run.
>
> From https://builds.apache.org/job/Flume-trunk-hbase-1/177/consoleFull :
>
> ---
>  T E S T S
> ---
> Running org.apache.flume.channel.file.TestLogFile
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.997 sec
> Running org.apache.flume.channel.file.TestFileChannel
> Killed
>
>
> I wonder if the OOM killer got it or something?
>
> Not a lot to go on from the test log:
>
> https://builds.apache.org/job/Flume-trunk-hbase-1/ws/flume-ng-channels/flume-file-channel/target/surefire-reports/org.apache.flume.channel.file.TestLogFile-output.txt
>
> Except this at the top:
>
> 2016-07-11 20:51:58,683 (main) [INFO - org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize(DirectMemoryUtils.java:112)] Unable to get maxDirectMemory from VM: NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)
> 2016-07-11 20:51:58,698 (main) [INFO - org.apache.flume.tools.DirectMemoryUtils.allocate(DirectMemoryUtils.java:48)] Direct Memory Allocation:  Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 5616697344, Remaining = 5616697344
>
> I wonder if that could have anything to do with this. I don't have much
> time to investigate right now, though.
>
> Mike
>
> On Mon, Jul 11, 2016 at 2:24 PM, Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
> > See 
> >
> > --
> > [...truncated 3122 lines...]
> > Tests in error:
> >   TestFileChannel.testInOut:117 ?  Failed to locate tar-ball distribution. Pleas...
> >   TestFileChannel.tearDown:95 ?  Failed to locate tar-ball distribution. Please ...
> >   TestSpooldirSource.setup:59 ?  Failed to locate tar-ball distribution. Please ...
> >   TestSpooldirSource.teardown:111 ?  Failed to locate tar-ball distribution. Ple...
> >   TestRpcClient.setUp:36 ?  Failed to locate tar-ball distribution. Please speci...
> >   TestRpcClient.tearDown:42 ?  Failed to locate tar-ball distribution. Please sp...
> >   TestRpcClientCommunicationFailure.testFailure:61 ?  Failed to locate tar-ball ...
> >   TestSyslogSource.testKeepFields:72 ?  Failed to locate tar-ball distribution. ...
> >   TestSyslogSource.tearDown:63 ? NullPointer
> >   TestSyslogSource.testRemoveFields:82 ?  Failed to locate tar-ball distribution...
> >   TestSyslogSource.tearDown:63 ? NullPointer
> >   TestSyslogSource.testKeepTimestampAndHostname:92 ?  Failed to locate tar-ball ...
> >   TestSyslogSource.tearDown:63 ? NullPointer
> >   TestSyslogSource.testKeepFields:72 ?  Failed to locate tar-ball distribution. ...
> >   TestSyslogSource.tearDown:63 ? NullPointer
> >   TestSyslogSource.testRemoveFields:82 ?  Failed to locate tar-ball distribution...
> >   TestSyslogSource.tearDown:63 ? NullPointer
> >   TestSyslogSource.testKeepTimestampAndHostname:92 ?  Failed to locate tar-ball ...
> >   TestSyslogSource.tearDown:63 ? NullPointer
> >
> > Tests run: 19, Failures: 0, Errors: 19, Skipped: 0
> >
> > [ERROR] There are test failures.
> >
> > Please refer to <https://builds.apache.org/job/Flume-trunk-hbase-1/ws/flume-ng-tests/target/surefire-reports>
> > for the individual test results.
> > [JENKINS] Recording test results
> > [INFO]
> > [INFO] --- maven-jar-plugin:2.3.1:jar (default-jar) @ flume-ng-tests ---
> > [INFO] Building jar: <https://builds.apache.org/job/Flume-trunk-hbase-1/ws/flume-ng-tests/target/flume-ng-tests-1.7.0-SNAPSHOT.jar>
> > [INFO]
> > [INFO] --- apache-rat-plugin:0.11:check (verify.rat) @ flume-ng-tests ---
> > [INFO] 51 implicit excludes (use -debug for more details).
> > [INFO] Exclude: **/.idea/
> > [INFO] Exclude: **/*.iml
> > [INFO] Exclude: **/nb-configuration.xml
> > [INFO] Exclude: .git/
> > [INFO] Exclude: patchprocess/
> > [INFO] Exclude: .gitignore
> > [INFO] Exclude: .repository/
> > [INFO] Exclude: **/*.diff
> > [INFO] Exclude: **/*.patch
> > [INFO] Exclude: **/*.avsc
> > [INFO] Exclude: **/*.avro
> > [INFO] Exclude: **/docs/**
> > [INFO] Exclude: **/test/resources/**
> > [INFO] Exclude: **/.settings/*
> > [INFO] Exclude: **/.classpath
> > [INFO] Exclude: **/.project
> > [INFO] Exclude: **/target/**
> > [INFO] Exclude: **/derby.log
> > [INFO] Exclude: **/metastore_db/
> > [INFO] 9 resources included (use -debug for more