[ 
https://issues.apache.org/jira/browse/FLUME-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stéphane Moreau updated FLUME-1814:
-----------------------------------

    Description: 
In Flume 1.3.0, it is not possible to parse UK or US dates on a French-locale 
computer using the {{RegexExtractorInterceptorMillisSerializer}} serializer of 
the regex extractor interceptor.

The {{DateTimeFormatter}} created in the serializer uses the default Locale, 
which is FR on my computer, so English month abbreviations such as "Dec" are 
not recognised. When I try to parse files that I received from the US, I get 
the following exception:
{code}
2012-12-31 17:09:13,370 (pool-5-thread-1) [ERROR - org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:148)] Uncaught exception in Runnable
java.lang.IllegalArgumentException: Invalid format: "29/Dec/2012:05:09:34 -0700" is malformed at "Dec/2012:05:09:34 -0700"
        at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:866)
        at org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer.serialize(RegexExtractorInterceptorMillisSerializer.java:48)
        at org.apache.flume.interceptor.RegexExtractorInterceptor.intercept(RegexExtractorInterceptor.java:147)
        at org.apache.flume.interceptor.RegexExtractorInterceptor.intercept(RegexExtractorInterceptor.java:158)
        at org.apache.flume.interceptor.InterceptorChain.intercept(InterceptorChain.java:62)
        at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:146)
        at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:143)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
{code}

The solution I propose is to add a new property called "language" to the 
interceptor, which would allow the default Locale to be overridden.
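
To illustrate why an explicit Locale helps, here is a minimal sketch of the same failure and fix. It is not the Flume code: to stay runnable without dependencies it uses the JDK's {{java.time}} API as a stand-in for Joda-Time, where {{DateTimeFormatter.withLocale(Locale)}} plays the analogous role. The class name {{LocaleParseDemo}}, its helper methods, and the pattern string are assumptions made for this example only.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical demo class; not part of Flume or Joda-Time.
class LocaleParseDemo {
    // Assumed pattern for timestamps like the one in the stack trace:
    // 29/Dec/2012:05:09:34 -0700
    static final String PATTERN = "dd/MMM/yyyy:HH:mm:ss Z";

    /** Parses the text with an explicit locale and returns epoch milliseconds. */
    static long parseMillis(String text, Locale locale) {
        // The locale decides which month names ("Dec" vs. French forms) are accepted.
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(PATTERN, locale);
        return OffsetDateTime.parse(text, formatter).toInstant().toEpochMilli();
    }

    /** Returns true when the text can be parsed under the given locale. */
    static boolean parses(String text, Locale locale) {
        try {
            parseMillis(text, locale);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String stamp = "29/Dec/2012:05:09:34 -0700";
        // A French formatter does not accept the English abbreviation "Dec".
        System.out.println("parses with FRENCH:  " + parses(stamp, Locale.FRENCH));
        // Passing the locale explicitly makes the result independent of the machine.
        System.out.println("millis with ENGLISH: " + parseMillis(stamp, Locale.ENGLISH));
    }
}
```

A "language" property as proposed above would presumably be turned into a {{Locale}} and passed to the formatter in the same way, instead of relying on the machine's default.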

  was:
It is not possible in the version 1.3.0 of Flume to parse UK or US date from a 
French computer using the interceptor 
{{RegexExtractorInterceptorMillisSerializer}}.

Indeed, the {{DateTimeFormatter}} created in the interceptor is currently using 
the default Locale which is FR on my computer. When I try to parse some file I 
got from US, I got the following exception:
{code}
2012-12-31 17:09:13,370 (pool-5-thread-1) [ERROR - org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:148)] Uncaught exception in Runnable
java.lang.IllegalArgumentException: Invalid format: "29/Dec/2012:05:09:34 -0700" is malformed at "Dec/2012:05:09:34 -0700"
        at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:866)
        at org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer.serialize(RegexExtractorInterceptorMillisSerializer.java:48)
        at org.apache.flume.interceptor.RegexExtractorInterceptor.intercept(RegexExtractorInterceptor.java:147)
        at org.apache.flume.interceptor.RegexExtractorInterceptor.intercept(RegexExtractorInterceptor.java:158)
        at org.apache.flume.interceptor.InterceptorChain.intercept(InterceptorChain.java:62)
        at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:146)
        at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:143)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
{code}

The solution I propose is to add a new property called "language" to the 
interceptor which will allow us to override the default Locale.

    
> Problem with the default Locale in RegexExtractorInterceptorMillisSerializer
> ----------------------------------------------------------------------------
>
>                 Key: FLUME-1814
>                 URL: https://issues.apache.org/jira/browse/FLUME-1814
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: v1.3.0
>            Reporter: Stéphane Moreau

