[jira] [Created] (FLUME-2748) ThriftLegacySource produces exception due to wrongly compiled thrift definitions

2015-07-27 Thread Tobias Heintz (JIRA)
Tobias Heintz created FLUME-2748:


 Summary: ThriftLegacySource produces exception due to wrongly 
compiled thrift definitions
 Key: FLUME-2748
 URL: https://issues.apache.org/jira/browse/FLUME-2748
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.6.0
Reporter: Tobias Heintz


We are in the process of upgrading our Flume installation from 0.9.2 to 1.6.0. 
Currently we are using the ThriftLegacySource to allow the Flume server to 
receive messages without having to update all components at the same time. For 
every received message, we are seeing this exception:
{code}
2015-07-24 17:15:28,892 (pool-3-thread-5) [ERROR - 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:215)]
 Error occurred during processing of message.

java.lang.NullPointerException
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}

I've done some digging in the code and it appears that there is an error in the 
Java classes that were compiled from the legacy thrift definitions: the method 
[{{append}} is defined as 
{{oneway}}|https://github.com/apache/flume/blob/344e0accae5675fd3d14b8414531528607865aae/flume-ng-legacy-sources/flume-thrift-source/src/main/thrift/flumeCompatibility.thrift#L61],
 however in the compiled class, the method [{{isOneway()}} returns 
{{false}}|https://github.com/apache/flume/blob/344e0accae5675fd3d14b8414531528607865aae/flume-ng-legacy-sources/flume-thrift-source/src/main/java/com/cloudera/flume/handlers/thrift/ThriftFlumeEventServer.java#L223].
 This then leads to the NullPointerException, when the [ProcessFunction tries 
to write the 
result|https://github.com/apache/thrift/blob/master/lib/java/src/org/apache/thrift/ProcessFunction.java#L53]
 back to the producer.

I'm not sure how this happened; perhaps the very old version (0.7) of the Thrift 
compiler is at fault here. The fix, however, would be to simply make the 
{{isOneway()}} method return {{true}}.
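
The failure mode described above can be sketched with a simplified stand-in. This is not the real Thrift code: SketchProcessFunction, Result and OnewayDemo are hypothetical names, and the sketch assumes that a oneway call produces no result object (null), so that writing a reply is only safe when isOneway() is false and a result actually exists.

```java
// Simplified stand-in for a Thrift-style process function.
// All names here are hypothetical; this is NOT the real Thrift API.
interface Result {
    void write(StringBuilder out);
}

abstract class SketchProcessFunction {
    abstract boolean isOneway();

    // Assumption: a oneway call has nothing to send back, so this returns null.
    abstract Result getResult();

    final void process(StringBuilder out) {
        Result result = getResult();
        if (!isOneway()) {
            // If isOneway() wrongly reports false for a oneway method,
            // result is null here and this line throws a NullPointerException.
            result.write(out);
        }
    }
}

final class OnewayDemo {
    // Models the legacy append() call with a configurable isOneway() answer.
    static SketchProcessFunction append(final boolean oneway) {
        return new SketchProcessFunction() {
            @Override
            boolean isOneway() {
                return oneway;
            }

            @Override
            Result getResult() {
                return null; // oneway: no reply to write
            }
        };
    }

    static boolean throwsNpe(SketchProcessFunction fn) {
        try {
            fn.process(new StringBuilder());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("isOneway() == false throws NPE: " + throwsNpe(append(false)));
        System.out.println("isOneway() == true throws NPE:  " + throwsNpe(append(true)));
    }
}
```

Under these assumptions, only the variant whose isOneway() returns false fails, which matches the proposed fix of returning true.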



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2458) Separate hdfs tmp directory for flume hdfs sink

2015-07-27 Thread Neerja Khattar (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neerja Khattar updated FLUME-2458:
--
Assignee: Neerja Khattar

 Separate hdfs tmp directory for flume hdfs sink
 ---

 Key: FLUME-2458
 URL: https://issues.apache.org/jira/browse/FLUME-2458
 Project: Flume
  Issue Type: Improvement
  Components: Sinks+Sources
Affects Versions: v1.5.0.1
Reporter: Sverre Bakke
Assignee: Neerja Khattar
Priority: Minor
 Attachments: FLUME-2458.patch, patch-2458.txt


 The current HDFS sink writes temporary files to the same directory in which 
 the final file will be stored. This is a problem for several reasons:
 1) File moving
 When MapReduce fetches a list of files to process and some of those files are 
 gone by the time it processes them (i.e. they have been moved from .tmp to 
 whatever final name they are supposed to have), the MapReduce job will crash.
 2) File type
 When MapReduce decides how to process a file, it looks at the file's 
 extension. If the file is compressed, it will decompress it for you. If the 
 file has a .tmp extension (in the same folder), it will treat a compressed 
 file as an uncompressed file, thus breaking the results of the MapReduce job.
 I propose that the sink gets an optional tmp path for storing these files to 
 avoid these issues.
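
 The proposal can be illustrated with a minimal sketch on the local filesystem. It is not the HDFS sink's code and does not use the HDFS API: TmpDirWriter, writeThenPublish and demo are hypothetical names, and java.nio.file stands in for HDFS. The in-progress file lives in a separate tmp directory and is moved into the final directory only once it is complete, so a job listing the final directory never sees a .tmp file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical local-filesystem illustration of "write to a tmp dir, then publish".
final class TmpDirWriter {

    static Path writeThenPublish(Path tmpDir, Path finalDir, String name, String content) {
        try {
            Files.createDirectories(tmpDir);
            Files.createDirectories(finalDir);

            // The in-progress file is only ever visible inside tmpDir.
            Path inProgress = tmpDir.resolve(name + ".tmp");
            Files.write(inProgress, content.getBytes(StandardCharsets.UTF_8));

            // Publish under the final name in one step. ATOMIC_MOVE assumes
            // both directories are on the same filesystem.
            return Files.move(inProgress, finalDir.resolve(name),
                    StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Small self-check: the final dir holds the file, the tmp dir is empty again.
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("tmp-dir-sketch");
            Path tmpDir = root.resolve("tmp");
            Path finalDir = root.resolve("final");
            Path published = writeThenPublish(tmpDir, finalDir, "events.gz", "payload");
            String[] leftovers = tmpDir.toFile().list();
            return Files.exists(published)
                    && published.getFileName().toString().equals("events.gz")
                    && leftovers != null
                    && leftovers.length == 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

 Because the published file never carries the .tmp extension inside the final directory, both problems listed above (files disappearing mid-job and compressed files being read as plain text) would be avoided in this model.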





[jira] [Created] (FLUME-2749) Kerberos configuration error when using short names in multiple HDFS Sinks

2015-07-27 Thread Johny Rufus (JIRA)
Johny Rufus created FLUME-2749:
--

 Summary: Kerberos configuration error when using short names in 
multiple HDFS Sinks
 Key: FLUME-2749
 URL: https://issues.apache.org/jira/browse/FLUME-2749
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.6.0
Reporter: Johny Rufus
Assignee: Johny Rufus


When we have more than one HDFS Sink configured in Kerberos mode, and the 
principal is configured with a short name like 'flume' (without the @REALM 
information), we get the following exception:

java.lang.IllegalStateException: Cannot use multiple kerberos principals in the 
same agent.  Must restart agent to use new principal or keytab. Previous = 
fl...@example.com (auth:KERBEROS), New = flume
at 
com.google.common.base.Preconditions.checkState(Preconditions.java:172)
at 
org.apache.flume.auth.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:131)
at 
org.apache.flume.auth.FlumeAuthenticationUtil.getAuthenticator(FlumeAuthenticationUtil.java:67)
at 
org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:261)






[jira] [Commented] (FLUME-2749) Kerberos configuration error when using short names in multiple HDFS Sinks

2015-07-27 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643838#comment-14643838
 ] 

ASF subversion and git services commented on FLUME-2749:


Commit 1161b044930579ebb803685753eb5b3363ee5178 in flume's branch 
refs/heads/flume-1.7 from [~hshreedharan]
[ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=1161b04 ]

FLUME-2749. Fix kerberos configuration error when using short names in multiple 
HDFS Sinks

(Johny Rufus via Hari)


 Kerberos configuration error when using short names in multiple HDFS Sinks
 --

 Key: FLUME-2749
 URL: https://issues.apache.org/jira/browse/FLUME-2749
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.6.0
Reporter: Johny Rufus
Assignee: Johny Rufus
 Attachments: FLUME-2749.patch







[jira] [Commented] (FLUME-2749) Kerberos configuration error when using short names in multiple HDFS Sinks

2015-07-27 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643839#comment-14643839
 ] 

Hari Shreedharan commented on FLUME-2749:
-

Committed! Thanks Johny!

 Kerberos configuration error when using short names in multiple HDFS Sinks
 --

 Key: FLUME-2749
 URL: https://issues.apache.org/jira/browse/FLUME-2749
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.6.0
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.7.0

 Attachments: FLUME-2749.patch







[jira] [Commented] (FLUME-2749) Kerberos configuration error when using short names in multiple HDFS Sinks

2015-07-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643875#comment-14643875
 ] 

Hudson commented on FLUME-2749:
---

UNSTABLE: Integrated in Flume-trunk-hbase-1 #114 (See 
[https://builds.apache.org/job/Flume-trunk-hbase-1/114/])
FLUME-2749. Fix kerberos configuration error when using short names in multiple 
HDFS Sinks (hshreedharan: 
http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=a4946111383b3dfdb4c128fe5390ff3983213cbb)
* flume-ng-auth/src/main/java/org/apache/flume/auth/FlumeAuthenticationUtil.java
* flume-ng-auth/src/main/java/org/apache/flume/auth/KerberosAuthenticator.java
* flume-ng-auth/src/test/java/org/apache/flume/auth/TestFlumeAuthenticator.java
* flume-ng-auth/src/main/java/org/apache/flume/auth/KerberosUser.java


 Kerberos configuration error when using short names in multiple HDFS Sinks
 --

 Key: FLUME-2749
 URL: https://issues.apache.org/jira/browse/FLUME-2749
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.6.0
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.7.0

 Attachments: FLUME-2749.patch







Jenkins build became unstable: Flume-trunk-hbase-1 #114

2015-07-27 Thread Apache Jenkins Server
See https://builds.apache.org/job/Flume-trunk-hbase-1/114/changes



Thread safety of a RegexExtractorInterceptorSerializer implementation

2015-07-27 Thread inOceans
Hello!
I am Chinese and my English is poor, so please bear with me!
My GitHub repository for Flume: https://github.com/hotfey/flume.ng.1.5.2

I have four source Flume agents on 4 machines, with log files as sources and 
Avro as sinks, and one sink Flume agent on another machine, with Avro as source 
and HDFS as sink.
I created a class that implements RegexExtractorInterceptorSerializer; it is 
attached (also see GitHub).

Every line in my log files starts with a timestamp, and so does every event.
I implemented RegexExtractorInterceptorSerializer just to create HDFS 
directories based on that timestamp.
(e.g. a timestamp of 28/Jul/2015 will create the HDFS directory .../2015/07/28)
But when I start all the Flume agents, I do not know how to ensure the thread 
safety of my implementation.
(e.g. if the timestamp on one of the source machines is 28/Jul/2015 and on 
another it is 21/Jun/2015, in practice any of .../2015/06/21, .../2015/06/28, 
.../2015/07/21 or .../2015/07/28 may be created.)

Can you give me some advice about it?

That's all, thanks!
Best wishes to you!
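
One possible direction, sketched below without having seen the attached class: mixed-up paths such as .../2015/06/28 would be consistent with year, month and day being kept in fields shared between threads, so the sketch assumes that is the cause and keeps no mutable state at all. TimestampPathSerializer is a hypothetical name. It does not implement the Flume interface, so that it stays self-contained, but its serialize(String) method is meant to mirror that interface's method. It uses java.time, which needs Java 8 or later; DateTimeFormatter instances are immutable and safe to share between threads.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical stateless converter: "28/Jul/2015" -> "2015/07/28".
final class TimestampPathSerializer {

    // Immutable formatters; sharing them across threads is safe.
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("dd/MMM/uuuu", Locale.ENGLISH);
    private static final DateTimeFormatter OUTPUT =
            DateTimeFormatter.ofPattern("uuuu/MM/dd", Locale.ENGLISH);

    // Every intermediate value is a local variable, so concurrent calls
    // cannot observe each other's year, month or day.
    static String serialize(String value) {
        LocalDate date = LocalDate.parse(value, INPUT);
        return date.format(OUTPUT);
    }

    public static void main(String[] args) {
        System.out.println(serialize("28/Jul/2015"));
        System.out.println(serialize("21/Jun/2015"));
    }
}
```

On an older JVM the same idea would apply with a new SimpleDateFormat created inside each call, since SimpleDateFormat instances are not safe to share between threads.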

[jira] [Updated] (FLUME-2749) Kerberos configuration error when using short names in multiple HDFS Sinks

2015-07-27 Thread Johny Rufus (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johny Rufus updated FLUME-2749:
---
Attachment: FLUME-2749.patch

Modified to use the pre-1.6 style of checking whether the user currently trying 
to log in differs from the already logged-in user (using the KerberosUser 
class, which stores the configured principal and keytab).
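
A minimal sketch of the idea, not the actual patch: ConfiguredKerberosUser and PrincipalCheckDemo are hypothetical names, and the realm and keytab path are made-up values. The exception message suggests that the previous login was remembered under its fully resolved name while the new one was given as a short name; the sketch assumes that, and contrasts it with comparing the configured principal and keytab instead.

```java
import java.util.Objects;

// Hypothetical value object holding what the user configured, not what
// Kerberos resolved it to.
final class ConfiguredKerberosUser {
    private final String principal;
    private final String keytab;

    ConfiguredKerberosUser(String principal, String keytab) {
        this.principal = principal;
        this.keytab = keytab;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ConfiguredKerberosUser)) {
            return false;
        }
        ConfiguredKerberosUser that = (ConfiguredKerberosUser) other;
        return Objects.equals(principal, that.principal)
                && Objects.equals(keytab, that.keytab);
    }

    @Override
    public int hashCode() {
        return Objects.hash(principal, keytab);
    }
}

final class PrincipalCheckDemo {

    // Assumed failing style: resolved login name versus configured short name.
    static boolean sameByResolvedName(String resolvedLogin, String configuredPrincipal) {
        return resolvedLogin.equals(configuredPrincipal);
    }

    // Style described in the comment above: compare configured values only.
    static boolean sameByConfiguration(ConfiguredKerberosUser previous,
                                       ConfiguredKerberosUser next) {
        return previous.equals(next);
    }

    public static void main(String[] args) {
        // Two sinks configured identically with a short name (made-up values).
        ConfiguredKerberosUser sink1 =
                new ConfiguredKerberosUser("flume", "/hypothetical/flume.keytab");
        ConfiguredKerberosUser sink2 =
                new ConfiguredKerberosUser("flume", "/hypothetical/flume.keytab");

        System.out.println(sameByResolvedName("flume@HYPOTHETICAL.REALM", "flume"));
        System.out.println(sameByConfiguration(sink1, sink2));
    }
}
```

With identical configuration on both sinks, the configuration-based comparison treats them as the same user, whereas the name-based comparison does not.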

 Kerberos configuration error when using short names in multiple HDFS Sinks
 --

 Key: FLUME-2749
 URL: https://issues.apache.org/jira/browse/FLUME-2749
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.6.0
Reporter: Johny Rufus
Assignee: Johny Rufus
 Attachments: FLUME-2749.patch







[jira] [Commented] (FLUME-2749) Kerberos configuration error when using short names in multiple HDFS Sinks

2015-07-27 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643812#comment-14643812
 ] 

Hari Shreedharan commented on FLUME-2749:
-

Looks good. Running tests now.

 Kerberos configuration error when using short names in multiple HDFS Sinks
 --

 Key: FLUME-2749
 URL: https://issues.apache.org/jira/browse/FLUME-2749
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.6.0
Reporter: Johny Rufus
Assignee: Johny Rufus
 Attachments: FLUME-2749.patch




