[ 
https://issues.apache.org/jira/browse/OOZIE-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563785#comment-16563785
 ] 

Zsombor Gegesy commented on OOZIE-3304:
---------------------------------------

It seems to be a network issue in the tests:

{code}
java.util.concurrent.ExecutionException: java.net.ConnectException: Call From 
asf917.gq1.ygridcore.net/67.195.81.137 to localhost:40116 failed on connection 
exception: java.net.ConnectException: Connection refused; For more details see: 
 http://wiki.apache.org/hadoop/ConnectionRefused
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at 
org.apache.oozie.tools.TestCopyTaskCallable.performAndCheckCallCopyTask(TestCopyTaskCallable.java:130)
        at 
org.apache.oozie.tools.TestCopyTaskCallable.testCallCopyTaskMoreFilesThanThreads(TestCopyTaskCallable.java:100)
{code}

The thread local variable initialization only happens, when it first accessed, 
so just creating hundreds of threads, shouldn't create hundreds of new 
SimpleDateFormat, and shouldn't trigger timeouts.

> Parsing sharelib timestamps is not threadsafe
> ---------------------------------------------
>
>                 Key: OOZIE-3304
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3304
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.0.0b1, 4.3.1
>            Reporter: Denes Bodo
>            Assignee: Denes Bodo
>            Priority: Critical
>              Labels: usability
>         Attachments: OOZIE-3304-001.patch, OOZIE-3304-002.patch
>
>
> In rare cases the following Exception can be read in log files when an action 
> fails:
> {code:java}
> org.apache.oozie.action.ActionExecutorException: NumberFormatException: 
> multiple points
>       at 
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:446)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1271)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1472)
>       at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:234)
>       at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:65)
>       at org.apache.oozie.command.XCommand.call(XCommand.java:287)
>       at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332)
>       at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NumberFormatException: multiple points
>       at 
> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1890)
>       at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
>       at java.lang.Double.parseDouble(Double.java:538)
>       at java.text.DigitList.getDouble(DigitList.java:169)
>       at java.text.DecimalFormat.parse(DecimalFormat.java:2056)
>       at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:2160)
>       at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1514)
>       at java.text.DateFormat.parse(DateFormat.java:364)
>       at 
> org.apache.oozie.service.ShareLibService.getLatestLibPath(ShareLibService.java:727)
>       at 
> org.apache.oozie.service.ShareLibService.getShareLibRootPath(ShareLibService.java:595)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.getSharelibRoot(JavaActionExecutor.java:1297)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.initShareLibExcluder(JavaActionExecutor.java:858)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.setLibFilesArchives(JavaActionExecutor.java:866)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1156)
>       ... 11 more
> {code}
> or
> {code:java}
> 2018-07-12 04:48:52,649  WARN ForkedActionStartXCommand:523 - 
> SERVER[ctr-e138-1518143905142-410551-01-000003.hwx.site] USER[user] GROUP[-] 
> TOKEN[] APP[demo-wf] JOB[0000023-180712043119670-oozie-oozi-W] 
> ACTION[0000023-180712043119670-oozie-oozi-W@streaming-node] Error starting 
> action [streaming-node]. ErrorType [ERROR], ErrorCode 
> [NumberFormatException], Message [NumberFormatException: For input string: ""]
> org.apache.oozie.action.ActionExecutorException: NumberFormatException: For 
> input string: ""
>       at 
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:446)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1271)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1472)
>       at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:234)
>       at 
> org.apache.oozie.command.wf.ForkedActionStartXCommand.execute(ForkedActionStartXCommand.java:41)
>       at 
> org.apache.oozie.command.wf.ForkedActionStartXCommand.execute(ForkedActionStartXCommand.java:30)
>       at org.apache.oozie.command.XCommand.call(XCommand.java:287)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NumberFormatException: For input string: ""
>       at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>       at java.lang.Long.parseLong(Long.java:601)
>       at java.lang.Long.parseLong(Long.java:631)
>       at java.text.DigitList.getLong(DigitList.java:195)
>       at java.text.DecimalFormat.parse(DecimalFormat.java:2051)
>       at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:2160)
>       at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1514)
>       at java.text.DateFormat.parse(DateFormat.java:364)
>       at 
> org.apache.oozie.service.ShareLibService.getLatestLibPath(ShareLibService.java:727)
>       at 
> org.apache.oozie.service.ShareLibService.getShareLibRootPath(ShareLibService.java:595)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.getSharelibRoot(JavaActionExecutor.java:1297)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.initShareLibExcluder(JavaActionExecutor.java:858)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.setLibFilesArchives(JavaActionExecutor.java:866)
>       at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1156)
>       ... 10 more
> {code}
> Doing a little research found 
> [https://docs.oracle.com/javase/7/docs/api/java/text/DateFormat.html] where 
> the Synchronization block describes we shall use different DateFormat 
> instances when parsing timestamps in SharelibService class.
> I'll try reproducing the issue in UT, for example having thousands of 
> sharelibs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to