[
https://issues.apache.org/jira/browse/OOZIE-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566514#comment-16566514
]
Peter Cseh commented on OOZIE-3304:
-----------------------------------
Tests have timed out again with this patch. Oozie-core tests ran for more than
5 hours!
{noformat}
[INFO]
[INFO] Tests run: 1477, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[ERROR] There was a timeout or other error in the fork
{noformat}
{noformat}
[INFO] Apache Oozie Share Lib Oozie ....................... SUCCESS [ 4.962 s]
[INFO] Apache Oozie Share Lib HCatalog .................... SUCCESS [ 1.483 s]
[INFO] Apache Oozie Share Lib Distcp ...................... SUCCESS [ 0.155 s]
[INFO] Apache Oozie Core .................................. SUCCESS [ 05:43 h]
[INFO] Apache Oozie Share Lib Streaming ................... SUCCESS [05:51 min]
[INFO] Apache Oozie Share Lib Pig ......................... SUCCESS [04:49 min]
[INFO] Apache Oozie Share Lib Hive ........................ SUCCESS [02:27 min]
[INFO] Apache Oozie Share Lib Hive 2 ...................... SUCCESS [01:53 min]
[INFO] Apache Oozie Share Lib Sqoop ....................... SUCCESS [03:04 min]
{noformat}
There were successful pre-commit builds yesterday for other patches (like
OOZIE-3193) so I'm pretty confident this patch is causing something to go
wrong. I can't pinpoint what is the problem in the patch itself though.
[~dionusos], can you please do some testing on this?
> Parsing sharelib timestamps is not threadsafe
> ---------------------------------------------
>
> Key: OOZIE-3304
> URL: https://issues.apache.org/jira/browse/OOZIE-3304
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.0.0b1, 4.3.1
> Reporter: Denes Bodo
> Assignee: Denes Bodo
> Priority: Critical
> Labels: usability
> Attachments: OOZIE-3304-001.patch, OOZIE-3304-002.patch
>
>
> In rare cases the following Exception can be read in log files when an action
> fails:
> {code:java}
> org.apache.oozie.action.ActionExecutorException: NumberFormatException:
> multiple points
> at
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:446)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1271)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1472)
> at
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:234)
> at
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:65)
> at org.apache.oozie.command.XCommand.call(XCommand.java:287)
> at
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332)
> at
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NumberFormatException: multiple points
> at
> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1890)
> at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
> at java.lang.Double.parseDouble(Double.java:538)
> at java.text.DigitList.getDouble(DigitList.java:169)
> at java.text.DecimalFormat.parse(DecimalFormat.java:2056)
> at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:2160)
> at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1514)
> at java.text.DateFormat.parse(DateFormat.java:364)
> at
> org.apache.oozie.service.ShareLibService.getLatestLibPath(ShareLibService.java:727)
> at
> org.apache.oozie.service.ShareLibService.getShareLibRootPath(ShareLibService.java:595)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.getSharelibRoot(JavaActionExecutor.java:1297)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.initShareLibExcluder(JavaActionExecutor.java:858)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.setLibFilesArchives(JavaActionExecutor.java:866)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1156)
> ... 11 more
> {code}
> or
> {code:java}
> 2018-07-12 04:48:52,649 WARN ForkedActionStartXCommand:523 -
> SERVER[ctr-e138-1518143905142-410551-01-000003.hwx.site] USER[user] GROUP[-]
> TOKEN[] APP[demo-wf] JOB[0000023-180712043119670-oozie-oozi-W]
> ACTION[0000023-180712043119670-oozie-oozi-W@streaming-node] Error starting
> action [streaming-node]. ErrorType [ERROR], ErrorCode
> [NumberFormatException], Message [NumberFormatException: For input string: ""]
> org.apache.oozie.action.ActionExecutorException: NumberFormatException: For
> input string: ""
> at
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:446)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1271)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1472)
> at
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:234)
> at
> org.apache.oozie.command.wf.ForkedActionStartXCommand.execute(ForkedActionStartXCommand.java:41)
> at
> org.apache.oozie.command.wf.ForkedActionStartXCommand.execute(ForkedActionStartXCommand.java:30)
> at org.apache.oozie.command.XCommand.call(XCommand.java:287)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NumberFormatException: For input string: ""
> at
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:601)
> at java.lang.Long.parseLong(Long.java:631)
> at java.text.DigitList.getLong(DigitList.java:195)
> at java.text.DecimalFormat.parse(DecimalFormat.java:2051)
> at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:2160)
> at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1514)
> at java.text.DateFormat.parse(DateFormat.java:364)
> at
> org.apache.oozie.service.ShareLibService.getLatestLibPath(ShareLibService.java:727)
> at
> org.apache.oozie.service.ShareLibService.getShareLibRootPath(ShareLibService.java:595)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.getSharelibRoot(JavaActionExecutor.java:1297)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.initShareLibExcluder(JavaActionExecutor.java:858)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.setLibFilesArchives(JavaActionExecutor.java:866)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1156)
> ... 10 more
> {code}
> Doing a little research found
> [https://docs.oracle.com/javase/7/docs/api/java/text/DateFormat.html] where
> the Synchronization block describes we shall use different DateFormat
> instances when parsing timestamps in SharelibService class.
> I'll try reproducing the issue in UT, for example having thousands of
> sharelibs.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)