[ https://issues.apache.org/jira/browse/APEXMALHAR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15594249#comment-15594249 ]
Deepak Narkhede commented on APEXMALHAR-2312:
---------------------------------------------

Issue reproduction with instrumentation logs:
============================================
2016-10-21 10:35:35,227 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: isIterationCompleted Directory: null File: /user/deepak/CustomerTxnData
2016-10-21 10:35:35,228 ERROR com.datatorrent.lib.io.fs.FileSplitterInput: service
java.lang.NullPointerException
	at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333)
	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:988)
	at java.util.Collections$UnmodifiableMap.get(Collections.java:1339)
	at com.datatorrent.lib.io.fs.FileSplitterInput$TimeBasedDirectoryScanner.isIterationCompleted(FileSplitterInput.java:402)
	at com.datatorrent.lib.io.fs.FileSplitterInput$TimeBasedDirectoryScanner.run(FileSplitterInput.java:358)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
2016-10-21 10:35:35,231 ERROR com.datatorrent.stram.engine.StreamingContainer: Operator set [OperatorDeployInfo[id=1,name=recordReader$FileSplitter,type=INPUT,checkpoint={ffffffffffffffff, 0, 0},inputs=[],outputs=[OperatorDeployInfo.OutputDeployInfo[portName=blocksMetadataOutput,streamId=recordReader$BlockMetadata,bufferServer=deepak-HP-ProBook-650-G2]]]] stopped running due to an exception.
java.lang.NullPointerException
	at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333)
	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:988)
	at java.util.Collections$UnmodifiableMap.get(Collections.java:1339)
	at com.datatorrent.lib.io.fs.FileSplitterInput$TimeBasedDirectoryScanner.isIterationCompleted(FileSplitterInput.java:402)
	at com.datatorrent.lib.io.fs.FileSplitterInput$TimeBasedDirectoryScanner.run(FileSplitterInput.java:358)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Get methods implementations of ConcurrentHashMap and HashMap:
======================================================
ConcurrentHashMap<> get():
-------------------------
...
     *
     * @throws NullPointerException if the specified key is null
     */
    public V get(Object key) {
        Segment<K,V> s; // manually integrate access methods to reduce overhead
        HashEntry<K,V>[] tab;
        int h = hash(key);
...

HashMap<> get():
---------------
    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key);
...

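The difference between the two get() implementations can be reproduced outside the operator. The following is a minimal standalone sketch, not code from Malhar: the class name NullKeyLookupDemo and the helper lookupNullKey are made up for illustration. Both maps are wrapped with Collections.unmodifiableMap, as the stack trace suggests referenceTimes is.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; not part of FileSplitterInput.
public class NullKeyLookupDemo {

    // Describes what get(null) does on the given map.
    static String lookupNullKey(Map<String, Long> map) {
        try {
            return "returned " + map.get(null);
        } catch (NullPointerException e) {
            return "threw NullPointerException";
        }
    }

    public static void main(String[] args) {
        Map<String, Long> hashMap =
            Collections.unmodifiableMap(new HashMap<String, Long>());
        Map<String, Long> concurrentMap =
            Collections.unmodifiableMap(new ConcurrentHashMap<String, Long>());

        // HashMap tolerates a null key and simply reports "no entry".
        System.out.println("HashMap: " + lookupNullKey(hashMap));
        // ConcurrentHashMap hashes the key first, so a null key fails.
        System.out.println("ConcurrentHashMap: " + lookupNullKey(concurrentMap));
    }
}
```

The unmodifiable wrapper only delegates get() to the underlying map, so it does not change the null-key behaviour of either implementation.
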
Testing logs with fix for files/directories/sub-directories:
==========================================
2016-10-21 11:20:38,382 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Directory path: /user/deepak/files Sub-Directory or File path: /user/deepak/files/CustomerTxnData2
2016-10-21 11:20:38,382 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Scan started for input /user/deepak/files
2016-10-21 11:20:38,386 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan /user/deepak/files
2016-10-21 11:20:33,372 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: discovered /user/deepak/files/CustomerTxnData 1477028632605
2016-10-21 11:20:33,372 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: discovered /user/deepak/files/CustomerTxnData1 1477028642067
2016-10-21 11:20:33,373 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: discovered /user/deepak/files/CustomerTxnData2 1477028645290
2016-10-21 11:20:33,373 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan complete 0 3
....
2016-10-21 11:25:50,697 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Directory path: null Sub-Directory or File path: /user/deepak/files/CustomerTxnData
2016-10-21 11:25:50,697 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Scan started for input /user/deepak/files/CustomerTxnData
2016-10-21 11:25:50,702 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan /user/deepak/files/CustomerTxnData
2016-10-21 11:25:50,704 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan complete

> NullPointerException in FileSplitterInput only if the file path is specified
> for attribute <files> instead of directory path
> ----------------------------------------------------------------------------
>
>                 Key: APEXMALHAR-2312
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2312
>             Project: Apache Apex Malhar
>          Issue Type: Bug
>            Reporter: Deepak Narkhede
>            Assignee: Deepak Narkhede
>            Priority: Minor
>
> Problem Statement:
> ==================
> A NullPointerException is seen in FileSplitterInput only if a file path is
> specified for the attribute <files> instead of a directory path.
> Description:
> ===========
> 1) The TimeBasedDirectoryScanner threads, which are part of the scan service,
> scan the directories/files.
> 2) Each thread calls isIterationCompleted(), which consults referenceTimes, to
> check whether the files scanned in the last iteration have been processed by
> the operator thread.
> 3) Previously this worked because referenceTimes was a HashMap, whose get()
> returns null even if the last scanned directory path is null.
> 4) referenceTimes was recently changed to a ConcurrentHashMap, whose get()
> does not allow a null key and throws NullPointerException.
> 5) When only a file path is provided, the directory path is null, so the key
> passed to get() is null and the NullPointerException is thrown.
> Solution:
> ========
> Pre-check the directory path in isIterationCompleted(): if it is null (only a
> file path was provided), treat the last iteration as completed.
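
The pre-check described in the Solution can be sketched as follows. This is an illustrative guess at the shape of the guard, not the actual patch: the class IterationCheckSketch, the parameter names, and the "an entry exists" completion test after the guard are all assumptions, since the real body of isIterationCompleted() is not shown in this message.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the null pre-check; not the FileSplitterInput source.
public class IterationCheckSketch {

    // Returns true when the last iteration can be considered completed.
    static boolean isIterationCompleted(Map<String, Long> referenceTimes,
                                        String lastScannedDirectory) {
        // Only a file path was configured, so there is no directory key.
        // Return early instead of passing null to ConcurrentHashMap.get().
        if (lastScannedDirectory == null) {
            return true;
        }
        // Assumed completion test: the directory has a recorded reference time.
        return referenceTimes.get(lastScannedDirectory) != null;
    }

    public static void main(String[] args) {
        Map<String, Long> referenceTimes = new ConcurrentHashMap<String, Long>();
        referenceTimes.put("/user/deepak/files", 1477028645290L);

        System.out.println(isIterationCompleted(referenceTimes, null));
        System.out.println(isIterationCompleted(referenceTimes, "/user/deepak/files"));
    }
}
```

Placing the guard before the lookup keeps ConcurrentHashMap as the map type while restoring the behaviour that the HashMap provided implicitly for a null directory path.
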