[jira] [Created] (IGNITE-5410) Invocation of HadoopOutputStream.write() with an empty array causes an AssertionError.
Ivan Veselovsky created IGNITE-5410:
---
Summary: Invocation of HadoopOutputStream.write() with an empty array causes an AssertionError.
Key: IGNITE-5410
URL: https://issues.apache.org/jira/browse/IGNITE-5410
Project: Ignite
Issue Type: Bug
Components: hadoop
Affects Versions: 2.1
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Priority: Minor

Writing an array of zero length causes the following AssertionError:
{code}
java.lang.AssertionError: 0
    at org.apache.ignite.internal.processors.hadoop.shuffle.streams.HadoopOffheapBuffer.move(HadoopOffheapBuffer.java:95)
    at org.apache.ignite.internal.processors.hadoop.shuffle.streams.HadoopDataOutStream.move(HadoopDataOutStream.java:55)
    at org.apache.ignite.internal.processors.hadoop.shuffle.collections.HadoopMultimapBase$AdderBase$1.move(HadoopMultimapBase.java:206)
    at org.apache.ignite.internal.processors.hadoop.shuffle.streams.HadoopDataOutStream.write(HadoopDataOutStream.java:70)
    at org.apache.hadoop.io.BytesWritable.write(BytesWritable.java:187)
    ...
{code}
The suggested fix is to relax the assertion so that a zero size is allowed:
{code}
assert size >= 0 : size;
{code}

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
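The failure mode and the relaxed assertion can be illustrated with a minimal, self-contained sketch (this is not the actual Ignite class; `move`/`write` here only mimic the shape of the failing code path):

```java
// Minimal sketch of the bug: a buffer whose move() asserted size > 0,
// so a zero-length write() blew up. The relaxed check below treats a
// zero-length move as a legal no-op.
public class OffheapBufferSketch {
    private long pos;

    // Relaxed assertion: zero is allowed, only negative sizes are errors.
    long move(int size) {
        assert size >= 0 : size;
        long p = pos;
        pos += size;
        return p;
    }

    public void write(byte[] b, int off, int len) {
        move(len); // len == 0 must not fail
    }

    public static void main(String[] args) {
        OffheapBufferSketch s = new OffheapBufferSketch();
        s.write(new byte[0], 0, 0); // previously: java.lang.AssertionError: 0
        s.write(new byte[] {1, 2, 3}, 0, 3);
        System.out.println(s.move(0)); // position after writing 3 bytes
    }
}
```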
[jira] [Created] (IGNITE-5193) Hadoop: Ignite node fails to start if some, but not all, HADOOP_XXX_HOME variables are set.
Ivan Veselovsky created IGNITE-5193:
---
Summary: Hadoop: Ignite node fails to start if some, but not all, HADOOP_XXX_HOME variables are set.
Key: IGNITE-5193
URL: https://issues.apache.org/jira/browse/IGNITE-5193
Project: Ignite
Issue Type: Bug
Components: hadoop
Affects Versions: 1.8
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 2.1

An Ignite node fails to start if some, but not all, of the 3 HADOOP_XXX_HOME variables are set (see trace below). This is caused by the following gap in the Ignite logic: org.apache.ignite.internal.processors.hadoop.HadoopClasspathUtils#exists returns {true} for an empty String argument. For unset location variables the value is an empty string, yet {org.apache.ignite.internal.processors.hadoop.HadoopLocations#xExists} gets {true}. This is the cause of the problem.
{code}
[06:09:42] Security status [authentication=off, tls/ssl=off]
[06:17:23,822][ERROR][main][IgniteKernal] Got exception while starting (will rollback startup routine).
java.lang.RuntimeException: class org.apache.ignite.IgniteCheckedException: Failed to resolve Hadoop JAR locations: Failed to get directory files [dir=]
    at org.apache.ignite.internal.processors.hadoop.HadoopClassLoader.addHadoopUrls(HadoopClassLoader.java:422)
    at org.apache.ignite.internal.processors.hadoop.HadoopClassLoader.<init>(HadoopClassLoader.java:134)
    at org.apache.ignite.internal.processors.hadoop.HadoopHelperImpl.commonClassLoader(HadoopHelperImpl.java:78)
    at org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem.start(IgniteHadoopIgfsSecondaryFileSystem.java:254)
    at org.apache.ignite.internal.processors.igfs.IgfsImpl.<init>(IgfsImpl.java:186)
    at org.apache.ignite.internal.processors.igfs.IgfsContext.<init>(IgfsContext.java:101)
    at org.apache.ignite.internal.processors.igfs.IgfsProcessor.start(IgfsProcessor.java:128)
    at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1638)
    at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:900)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1799)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1602)
    at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1042)
    at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:964)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:850)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:749)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:619)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:589)
    at org.apache.ignite.Ignition.start(Ignition.java:347)
    at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:302)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to resolve Hadoop JAR locations: Failed to get directory files [dir=]
    at org.apache.ignite.internal.processors.hadoop.HadoopClassLoader.hadoopUrls(HadoopClassLoader.java:456)
    at org.apache.ignite.internal.processors.hadoop.HadoopClassLoader.addHadoopUrls(HadoopClassLoader.java:419)
    ... 18 more
Caused by: java.io.IOException: Failed to get directory files [dir=]
    at org.apache.ignite.internal.processors.hadoop.HadoopClasspathUtils$SearchDirectory.files(HadoopClasspathUtils.java:344)
    at org.apache.ignite.internal.processors.hadoop.HadoopClasspathUtils.classpathForClassLoader(HadoopClasspathUtils.java:68)
    at org.apache.ignite.internal.processors.hadoop.HadoopClassLoader.hadoopUrls(HadoopClassLoader.java:453)
    ... 19 more
{code}

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
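The fix direction can be sketched with a hypothetical helper (not the actual HadoopClasspathUtils code): an unset HADOOP_XXX_HOME variable arrives as an empty string and must be treated as "does not exist" before any file-system check runs, since an empty path can otherwise resolve to the current directory.

```java
import java.io.File;

public class HomeVarCheckSketch {
    // Hypothetical guard: treat null/empty as "not set" instead of
    // passing the value on to the file-system existence check.
    static boolean dirExists(String path) {
        return path != null && !path.isEmpty() && new File(path).isDirectory();
    }

    public static void main(String[] args) {
        System.out.println(dirExists(""));  // false -- unset HADOOP_XXX_HOME
        System.out.println(dirExists(".")); // true -- a real directory
    }
}
```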
[jira] [Created] (IGNITE-5168) Expose IGFS metrics via an MBean or console Visor.
Ivan Veselovsky created IGNITE-5168:
---
Summary: Expose IGFS metrics via an MBean or console Visor.
Key: IGNITE-5168
URL: https://issues.apache.org/jira/browse/IGNITE-5168
Project: Ignite
Issue Type: New Feature
Components: IGFS
Affects Versions: 2.0
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky

Currently the only way to see IGFS metrics in Ignite is to obtain them programmatically from code. But customers need them to understand what is cached in IGFS, so exposing the metrics would be useful. There are options to expose the metrics via an MBean and/or via the console Visor interface.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
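The MBean route could look like the following sketch. The interface and attribute names here are illustrative assumptions, not the actual Ignite API; only the JMX registration mechanics are standard.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class IgfsMetricsMBeanSketch {
    // Hypothetical read-only metrics interface; the MXBean suffix lets
    // JMX expose the implementing class without a ClassNameMBean pair.
    public interface IgfsMetricsMXBean {
        long getFilesCount();
        long getBlocksReadTotal();
    }

    // Stand-in implementation returning fixed numbers; a real one would
    // delegate to the IGFS metrics object.
    public static class Impl implements IgfsMetricsMXBean {
        public long getFilesCount() { return 42; }
        public long getBlocksReadTotal() { return 1024; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer srv = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("org.apache:type=Igfs,name=metrics");
        srv.registerMBean(new Impl(), name);
        // The attribute is now visible to JConsole/VisualVM as well.
        System.out.println(srv.getAttribute(name, "FilesCount"));
    }
}
```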
[jira] [Created] (IGNITE-5131) Hadoop: update asm library to a version that can parse 1.8 bytecode
Ivan Veselovsky created IGNITE-5131:
---
Summary: Hadoop: update asm library to a version that can parse 1.8 bytecode
Key: IGNITE-5131
URL: https://issues.apache.org/jira/browse/IGNITE-5131
Project: Ignite
Issue Type: Bug
Components: hadoop
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Priority: Minor

This question indicates that the asm bytecode parser 4.2 cannot parse 1.8 bytecode: http://stackoverflow.com/questions/34318028/apache-ignite-failed-to-load-job-class-class-org-apache-ignite-internal-proces
The current dependency is org.ow2.asm:asm-all:4.2. Consider updating the asm library to a newer version that understands the 1.8 bytecode version (likely 5.2? http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.ow2.asm%22%20AND%20a%3A%22asm-all%22 ).

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
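For reference, the coordinates above correspond to a Maven dependency along these lines (5.2 is only the candidate version mentioned in the report, not a confirmed choice):

```xml
<dependency>
    <groupId>org.ow2.asm</groupId>
    <artifactId>asm-all</artifactId>
    <version>5.2</version>
</dependency>
```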
[jira] [Created] (IGNITE-4813) Ignite map-reduce engine should set MRJobConfig.TASK_ATTEMPT_ID
Ivan Veselovsky created IGNITE-4813:
---
Summary: Ignite map-reduce engine should set MRJobConfig.TASK_ATTEMPT_ID
Key: IGNITE-4813
URL: https://issues.apache.org/jira/browse/IGNITE-4813
Project: Ignite
Issue Type: Bug
Components: hadoop
Affects Versions: 1.8
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 2.0

The Hadoop "join" example fails on Ignite with an error like this:
{code}
Out: class org.apache.ignite.IgniteCheckedException: class org.apache.ignite.IgniteCheckedException: null
[14:27:29,636][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2MapTask.run0(HadoopV2MapTask.java:102)
[14:27:29,636][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2Task.run(HadoopV2Task.java:55)
[14:27:29,636][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2TaskContext.run(HadoopV2TaskContext.java:266)
[14:27:29,636][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.runTask(HadoopRunnableTask.java:209)
[14:27:29,637][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call0(HadoopRunnableTask.java:144)
[14:27:29,637][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:116)
[14:27:29,637][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:114)
[14:27:29,637][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2TaskContext.runAsJobOwner(HadoopV2TaskContext.java:573)
[14:27:29,637][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:114)
[14:27:29,637][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:46)
[14:27:29,637][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopExecutorService$2.body(HadoopExecutorService.java:186)
[14:27:29,637][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
[14:27:29,637][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at java.lang.Thread.run(Thread.java:745)
[14:27:29,638][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: Caused by: java.lang.NullPointerException
[14:27:29,638][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.<init>(TaskAttemptContextImpl.java:49)
[14:27:29,638][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.hadoop.mapreduce.lib.join.Parser$WNode.createRecordReader(Parser.java:348)
[14:27:29,638][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.hadoop.mapreduce.lib.join.Parser$CNode.createRecordReader(Parser.java:486)
[14:27:29,638][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat.createRecordReader(CompositeInputFormat.java:143)
[14:27:29,638][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2MapTask.run0(HadoopV2MapTask.java:69)
[14:27:29,638][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out: ... 12 more
[14:27:29,638][INFO ][Thread-3][jvm-a6fc1c46] PID-31907 Out:
{code}
This is because org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2Context sets the job id and task id, but does not set the task attempt id. In Hadoop this is done in the method org.apache.hadoop.mapred.Task#localizeConfiguration .
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
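What the engine needs to set can be sketched as follows. `MRJobConfig.TASK_ATTEMPT_ID` is Hadoop's constant for the key "mapreduce.task.attempt.id"; a plain map stands in for `JobConf` here to keep the sketch dependency-free, and the id format follows Hadoop's `attempt_<jtId>_<jobNumber>_<m|r>_<taskNumber>_<attemptNumber>` convention.

```java
import java.util.HashMap;
import java.util.Map;

public class TaskAttemptIdSketch {
    // Hadoop's MRJobConfig.TASK_ATTEMPT_ID key.
    static final String TASK_ATTEMPT_ID = "mapreduce.task.attempt.id";

    // Build an attempt id in Hadoop's standard textual form.
    static String attemptId(String jtId, int job, boolean map, int task, int attempt) {
        return String.format("attempt_%s_%04d_%s_%06d_%d",
            jtId, job, map ? "m" : "r", task, attempt);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>(); // stand-in for JobConf
        conf.put(TASK_ATTEMPT_ID, attemptId("local123", 1, true, 0, 0));
        System.out.println(conf.get(TASK_ATTEMPT_ID));
        // -> attempt_local123_0001_m_000000_0
    }
}
```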
[jira] [Created] (IGNITE-4808) Add all Hadoop examples as Ignite unit tests with default multi-JVM execution mode
Ivan Veselovsky created IGNITE-4808:
---
Summary: Add all Hadoop examples as Ignite unit tests with default multi-JVM execution mode
Key: IGNITE-4808
URL: https://issues.apache.org/jira/browse/IGNITE-4808
Project: Ignite
Issue Type: Bug
Components: hadoop
Affects Versions: 2.0
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky

We should have all Hadoop examples as Ignite unit tests, with multi-JVM mode being the default.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IGNITE-4755) Print a warning when option 'ignite.job.shared.classloader' is on.
Ivan Veselovsky created IGNITE-4755:
---
Summary: Print a warning when option 'ignite.job.shared.classloader' is on.
Key: IGNITE-4755
URL: https://issues.apache.org/jira/browse/IGNITE-4755
Project: Ignite
Issue Type: Bug
Components: hadoop
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IGNITE-4623) IGFS test suite should run successfully on Windows agents
Ivan Veselovsky created IGNITE-4623:
---
Summary: IGFS test suite should run successfully on Windows agents
Key: IGNITE-4623
URL: https://issues.apache.org/jira/browse/IGNITE-4623
Project: Ignite
Issue Type: Bug
Reporter: Ivan Veselovsky
Assignee: Taras Ledkov

Currently on Windows agents (*_*_*_9090) there are ~560 failures related to local file system behavior differences on Windows systems. E.g. see
1.8.3: http://ci.ignite.apache.org/viewLog.html?buildId=435288&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteGgfs
2.0: http://ci.ignite.apache.org/viewLog.html?buildId=435289&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteGgfs
Most of the failures are caused by an NPE in org.apache.ignite.igfs.secondary.local.LocalIgfsSecondaryFileSystem#info (line 370).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-4514) Test HadoopCommandLineTest.testHiveCommandLine fails
Ivan Veselovsky created IGNITE-4514:
---
Summary: Test HadoopCommandLineTest.testHiveCommandLine fails
Key: IGNITE-4514
URL: https://issues.apache.org/jira/browse/IGNITE-4514
Project: Ignite
Issue Type: Test
Components: hadoop
Affects Versions: 1.8
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 2.0

Test HadoopCommandLineTest.testHiveCommandLine reproducibly fails due to a failed assertion:
{code}
java.lang.AssertionError
    at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2TaskContext.readExternalSplit(HadoopV2TaskContext.java:505)
    at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2TaskContext.getNativeSplit(HadoopV2TaskContext.java:483)
    at org.apache.ignite.internal.processors.hadoop.impl.v1.HadoopV1MapTask.run(HadoopV1MapTask.java:75)
    at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2TaskContext.run(HadoopV2TaskContext.java:257)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.runTask(HadoopRunnableTask.java:201)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call0(HadoopRunnableTask.java:144)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:116)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:114)
    at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2TaskContext.runAsJobOwner(HadoopV2TaskContext.java:569)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:114)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:46)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopExecutorService$2.body(HadoopExecutorService.java:186)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at java.lang.Thread.run(Thread.java:745)
{code}
The problem is that org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2TaskContext is loaded by the job class loader if the class loader is "shared" (see org.apache.ignite.internal.processors.hadoop.HadoopJobProperty#JOB_SHARED_CLASSLOADER), and this is true by default. But the assertion in the test expects it to be the task class loader, which can be the case, but is not by default.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-4386) Hadoop tests affect each other through IgniteHadoopClientProtocolProvider#cliMap
Ivan Veselovsky created IGNITE-4386:
---
Summary: Hadoop tests affect each other through IgniteHadoopClientProtocolProvider#cliMap
Key: IGNITE-4386
URL: https://issues.apache.org/jira/browse/IGNITE-4386
Project: Ignite
Issue Type: Bug
Components: hadoop
Affects Versions: 1.7
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 1.8

Tests affect each other through the map org.apache.ignite.hadoop.mapreduce.IgniteHadoopClientProtocolProvider#cliMap, which is never cleared. For example, test org.apache.ignite.internal.processors.hadoop.impl.client.HadoopClientProtocolMultipleServersSelfTest#testSingleAddress sometimes fails if test org.apache.ignite.internal.processors.hadoop.impl.client.HadoopClientProtocolSelfTest#testJobCounters runs before it.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
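The leak pattern, and the kind of cleanup hook a test fixture would need, can be sketched like this (names are illustrative; the real `cliMap` caches client-protocol instances, not plain objects):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of the problem: a provider-level static cache that survives
// across tests unless explicitly cleared between them.
public class ClientCacheSketch {
    static final ConcurrentMap<String, Object> cliMap = new ConcurrentHashMap<>();

    // Returns a cached client for the address, creating it on first use.
    static Object client(String addr) {
        return cliMap.computeIfAbsent(addr, a -> new Object());
    }

    // What a tearDown() between tests would need to invoke.
    static void closeAll() {
        cliMap.clear();
    }

    public static void main(String[] args) {
        Object c1 = client("127.0.0.1:11211");
        closeAll();
        Object c2 = client("127.0.0.1:11211");
        System.out.println(c1 != c2); // true: no stale client leaks into the next test
    }
}
```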
[jira] [Created] (IGNITE-4355) Eliminate map threads pauses during startup
Ivan Veselovsky created IGNITE-4355:
---
Summary: Eliminate map threads pauses during startup
Key: IGNITE-4355
URL: https://issues.apache.org/jira/browse/IGNITE-4355
Project: Ignite
Issue Type: Sub-task
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky

Pauses are observed in all Map threads but one at the beginning. This is caused by waiting on future.get() in HadoopV2Job.getTaskContext(HadoopTaskInfo):
{code}
    at sun.misc.Unsafe.park(boolean, long)
    at java.util.concurrent.locks.LockSupport.park(Object)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt()
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(int)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(int)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(boolean)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get()
    at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2Job.getTaskContext(HadoopTaskInfo)
    at org.apache.ignite.internal.processors.hadoop.shuffle.HadoopShuffleJob.<init>(Object, IgniteLogger, HadoopJob, GridUnsafeMemory, int, int[], int)
    at org.apache.ignite.internal.processors.hadoop.shuffle.HadoopShuffle.newJob(HadoopJobId)
    at org.apache.ignite.internal.processors.hadoop.shuffle.HadoopShuffle.job(HadoopJobId)
    at org.apache.ignite.internal.processors.hadoop.shuffle.HadoopShuffle.output(HadoopTaskContext)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopEmbeddedTaskExecutor$1.createOutput(HadoopTaskContext)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.createOutputInternal(HadoopTaskContext)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.runTask(HadoopPerformanceCounter)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call0()
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call()
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call()
    at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2TaskContext.runAsJobOwner(Callable)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call()
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call()
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopExecutorService$2.body()
    at org.apache.ignite.internal.util.worker.GridWorker.run()
    at java.lang.Thread.run()
{code}
while the working thread initializes the context:
{code}
Java Monitor Wait
    at java.lang.Object.wait(long)
    at java.lang.Thread.join(long)
    at java.lang.Thread.join()
    at org.apache.hadoop.util.Shell.joinThread(Thread)
    at org.apache.hadoop.util.Shell.runCommand()
    at org.apache.hadoop.util.Shell.run()
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute()
    at org.apache.hadoop.util.Shell.isSetsidSupported()
    at org.apache.hadoop.util.Shell.<clinit>()
    at org.apache.hadoop.util.StringUtils.<clinit>()
    at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(Configuration)
    at org.apache.hadoop.security.UserGroupInformation.initialize(Configuration, boolean)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized()
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(Subject)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser()
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser()
    at org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(Configuration, JobID)
    at org.apache.hadoop.mapred.JobContextImpl.<init>(JobConf, JobID, Progressable)
    at org.apache.hadoop.mapred.JobContextImpl.<init>(JobConf, JobID)
    at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2TaskContext.<init>(HadoopTaskInfo, HadoopJob, HadoopJobId, UUID, DataInput)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Constructor, Object[])
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Object[])
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Object[])
    at java.lang.reflect.Constructor.newInstance(Object[])
    at org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopV2Job.getTaskContext(HadoopTaskInfo)
    at org.apache.ignite.internal.processors.hadoop.shuffle.HadoopShuffleJob.<init>(Object, IgniteLogger, HadoopJob, GridUnsafeMemory, int, int[], int)
    at org.apache.ignite.internal.processors.hadoop.shuffle.HadoopShuffle.newJob(HadoopJobId)
    at org.apache.ignite.internal.processors.hadoop.shuffle.HadoopShuffle.job(HadoopJobId)
    at org.apache.ignite.internal.processors.hadoop.shuffle.HadoopShuffle.output(HadoopTaskContext)
    at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopEmbeddedTaskExecutor$1.createOutput(HadoopTaskContext)
    ...
{code}
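One possible direction, sketched below under the assumption that the pauses come from one-time class initialization (the `Shell`/`UserGroupInformation` `<clinit>` visible in the second trace): run the expensive one-time initialization eagerly on a separate thread before the map threads start, so they do not all park on the same future. This is an illustration of the idea, not the actual fix.

```java
import java.util.concurrent.CompletableFuture;

public class EagerInitSketch {
    static volatile boolean initialized;

    // Stand-in for the expensive one-time work (static init of Hadoop's
    // Shell/UserGroupInformation in the report).
    static void expensiveOneTimeInit() {
        try { Thread.sleep(100); } catch (InterruptedException ignored) {}
        initialized = true;
    }

    public static void main(String[] args) {
        // Kick off the warm-up early, overlapping it with job setup.
        CompletableFuture<Void> warmup =
            CompletableFuture.runAsync(EagerInitSketch::expensiveOneTimeInit);
        // ... build and submit the job while warm-up proceeds ...
        warmup.join(); // by task-start time this is typically already done
        System.out.println(initialized); // true
    }
}
```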
[jira] [Created] (IGNITE-4341) Add TeraSort example as a unit test to Ignite
Ivan Veselovsky created IGNITE-4341:
---
Summary: Add TeraSort example as a unit test to Ignite
Key: IGNITE-4341
URL: https://issues.apache.org/jira/browse/IGNITE-4341
Project: Ignite
Issue Type: Test
Components: hadoop
Affects Versions: 1.7
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 1.8

Add the canonical TeraSort example as a unit test.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-4098) Spilled map-reduce: reduce side.
Ivan Veselovsky created IGNITE-4098:
---
Summary: Spilled map-reduce: reduce side.
Key: IGNITE-4098
URL: https://issues.apache.org/jira/browse/IGNITE-4098
Project: Ignite
Issue Type: Sub-task
Components: hadoop
Affects Versions: 1.6
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 1.9

Implement spilling data on the reducer side.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-4097) Spilled map-reduce: map side.
Ivan Veselovsky created IGNITE-4097:
---
Summary: Spilled map-reduce: map side.
Key: IGNITE-4097
URL: https://issues.apache.org/jira/browse/IGNITE-4097
Project: Ignite
Issue Type: Sub-task
Components: hadoop
Affects Versions: 1.6
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 1.8

Implement spilled output on the Map side of map-reduce. In general, the algorithm should follow the one used in Hadoop. The differences on the Map side are that 1) we use a sorting collection (Hadoop sorts a range of map outputs explicitly); 2) we store the map output not via the FileSystem abstraction, but rather in local files.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
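The map-side scheme described above can be sketched as a toy spill buffer: key/value pairs accumulate in a sorted in-memory collection and are flushed to a local temp file when a threshold is reached. A real implementation would also merge the spill files on close; everything here is an illustrative assumption, not the Ignite code.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SpillSketch {
    private final TreeMap<String, String> buf = new TreeMap<>(); // the "sorting collection"
    private final int limit;
    final List<Path> spills = new ArrayList<>();

    SpillSketch(int limit) { this.limit = limit; }

    void put(String k, String v) throws IOException {
        buf.put(k, v);
        if (buf.size() >= limit)
            spill();
    }

    // Write the buffered, already-sorted pairs to a local temp file.
    void spill() throws IOException {
        Path f = Files.createTempFile("map-spill", ".txt");
        try (BufferedWriter w = Files.newBufferedWriter(f)) {
            for (Map.Entry<String, String> e : buf.entrySet())
                w.write(e.getKey() + "\t" + e.getValue() + "\n");
        }
        spills.add(f);
        buf.clear();
    }

    public static void main(String[] args) throws IOException {
        SpillSketch s = new SpillSketch(2);
        s.put("b", "2");
        s.put("a", "1"); // hits the limit, triggers a sorted spill of {a, b}
        System.out.println(s.spills.size()); // 1
    }
}
```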
[jira] [Created] (IGNITE-4037) High memory consumption when executing TeraSort Hadoop example
Ivan Veselovsky created IGNITE-4037:
---
Summary: High memory consumption when executing TeraSort Hadoop example
Key: IGNITE-4037
URL: https://issues.apache.org/jira/browse/IGNITE-4037
Project: Ignite
Issue Type: Bug
Affects Versions: 1.6
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 1.7

When executing the TeraSort Hadoop example, we observe high memory consumption that frequently leads to cluster malfunction. The problem can be reproduced in a unit test, even with 1 node and with an input data set as small as 100Mb.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3998) IGFS: uncomment testCreateConsistencyMultithreaded
Ivan Veselovsky created IGNITE-3998:
---
Summary: IGFS: uncomment testCreateConsistencyMultithreaded
Key: IGNITE-3998
URL: https://issues.apache.org/jira/browse/IGNITE-3998
Project: Ignite
Issue Type: Bug
Components: IGFS
Affects Versions: 1.6
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 1.8

Test org.apache.ignite.internal.processors.igfs.IgfsAbstractSelfTest#testCreateConsistencyMultithreaded was commented out for an unknown reason. Uncomment it.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3966) Hadoop: automatically add ${HADOOP_HOME}/lib/native to java.library.path system property
Ivan Veselovsky created IGNITE-3966:
---
Summary: Hadoop: automatically add ${HADOOP_HOME}/lib/native to java.library.path system property
Key: IGNITE-3966
URL: https://issues.apache.org/jira/browse/IGNITE-3966
Project: Ignite
Issue Type: Bug
Components: hadoop
Affects Versions: 1.6
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 1.8

Currently, in order to use native libs, the user is expected to add -J-Djava.library.path explicitly upon node start. In most Hadoop distributions the native libraries are found in ${HADOOP_HOME}/lib/native/, and, if such a directory exists, we can add the -Djava.library.path parameter automatically. Note that if -Djava.library.path is also given explicitly by the user, we should merge the user's explicit value with our implicitly added value.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
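The merge rule described above can be sketched as follows (a hypothetical helper, not the Ignite startup code): append the detected native directory to the user's java.library.path, avoiding a duplicate entry if the user already listed it.

```java
import java.io.File;

public class LibPathMergeSketch {
    // Merge an explicit -Djava.library.path value with the auto-detected
    // ${HADOOP_HOME}/lib/native directory.
    static String merge(String userPath, String hadoopHome) {
        String nativeDir = hadoopHome + File.separator + "lib" + File.separator + "native";
        if (userPath == null || userPath.isEmpty())
            return nativeDir;
        // Avoid duplicating the entry if the user already listed it.
        for (String p : userPath.split(File.pathSeparator))
            if (p.equals(nativeDir))
                return userPath;
        return userPath + File.pathSeparator + nativeDir;
    }

    public static void main(String[] args) {
        // e.g. "/opt/libs:/usr/lib/hadoop/lib/native" on Unix-like systems
        System.out.println(merge("/opt/libs", "/usr/lib/hadoop"));
    }
}
```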
[jira] [Created] (IGNITE-3922) IGFS: test org.apache.ignite.internal.processors.igfs.IgfsTaskSelfTest hangs on WiFi network
Ivan Veselovsky created IGNITE-3922:
---
Summary: IGFS: test org.apache.ignite.internal.processors.igfs.IgfsTaskSelfTest hangs on WiFi network
Key: IGNITE-3922
URL: https://issues.apache.org/jira/browse/IGNITE-3922
Project: Ignite
Issue Type: Bug
Components: IGFS
Affects Versions: 1.7
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 1.8

The following 2 tests frequently hang:
org.apache.ignite.internal.processors.igfs.IgfsTaskSelfTest#testTask
org.apache.ignite.internal.processors.igfs.IgfsTaskSelfTest#testTaskAsync
This also happens on public TC. The thread dump does not show any specific reason for the hang, but it frequently contains a stack like this:
{code}
Thread [name="test-runner-#298%igfs.IgfsTaskSelfTest%", id=364, state=RUNNABLE, blockCnt=14, waitCnt=25]
    at java.net.Inet6AddressImpl.getHostByAddr(Native Method)
    at java.net.InetAddress$1.getHostByAddr(InetAddress.java:905)
    at java.net.InetAddress.getHostFromNameService(InetAddress.java:590)
    at java.net.InetAddress.getHostName(InetAddress.java:532)
    at java.net.InetAddress.getHostName(InetAddress.java:504)
    at o.a.i.i.processors.igfs.IgfsBlockLocationImpl.convertFromNodes(IgfsBlockLocationImpl.java:304)
    at o.a.i.i.processors.igfs.IgfsBlockLocationImpl.<init>(IgfsBlockLocationImpl.java:101)
    at o.a.i.i.processors.igfs.IgfsDataManager.splitBlocks(IgfsDataManager.java:895)
    at o.a.i.i.processors.igfs.IgfsDataManager.affinity0(IgfsDataManager.java:862)
    at o.a.i.i.processors.igfs.IgfsDataManager.affinity(IgfsDataManager.java:738)
    at o.a.i.i.processors.igfs.IgfsImpl$18.call(IgfsImpl.java:1216)
    at o.a.i.i.processors.igfs.IgfsImpl$18.call(IgfsImpl.java:1191)
    at o.a.i.i.processors.igfs.IgfsImpl.safeOp(IgfsImpl.java:1679)
    at o.a.i.i.processors.igfs.IgfsImpl.affinity(IgfsImpl.java:1191)
    at o.a.i.igfs.mapreduce.IgfsTask.map(IgfsTask.java:116)
    at o.a.i.igfs.mapreduce.IgfsTask.map(IgfsTask.java:85)
    at o.a.i.i.processors.task.GridTaskWorker$2.call(GridTaskWorker.java:519)
    at o.a.i.i.processors.task.GridTaskWorker$2.call(GridTaskWorker.java:517)
    at o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6509)
    at o.a.i.i.processors.task.GridTaskWorker.body(GridTaskWorker.java:516)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
    at o.a.i.i.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:678)
    at o.a.i.i.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:403)
    at o.a.i.i.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:385)
    at o.a.i.i.processors.igfs.IgfsImpl.executeAsync(IgfsImpl.java:1446)
    at o.a.i.i.processors.igfs.IgfsImpl.executeAsync(IgfsImpl.java:1427)
    at o.a.i.i.processors.igfs.IgfsImpl.execute(IgfsImpl.java:1375)
    at o.a.i.i.processors.igfs.IgfsTaskSelfTest.testTask(IgfsTaskSelfTest.java:171)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at junit.framework.TestCase.runTest(TestCase.java:176)
    at o.a.i.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:1760)
    at o.a.i.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:118)
    at o.a.i.testframework.junits.GridAbstractTest$4.run(GridAbstractTest.java:1698)
    at java.lang.Thread.run(Thread.java:745)
{code}
Measurements of the duration of the method org.apache.ignite.internal.processors.igfs.IgfsBlockLocationImpl#convertFromNodes show the following:
{code}
...
convertFromNodes Took: 39 ms
convertFromNodes Took: 34 ms
convertFromNodes Took: 40 ms
convertFromNodes Took: 32 ms
convertFromNodes Took: 39 ms
convertFromNodes Took: 32 ms
convertFromNodes Took: 32 ms
convertFromNodes Took: 37 ms
convertFromNodes Took: 31 ms
convertFromNodes Took: 31 ms
convertFromNodes Took: 5067 ms
convertFromNodes Took: 33 ms
convertFromNodes Took: 31 ms
convertFromNodes Took: 137 ms
convertFromNodes Took: 33 ms
convertFromNodes Took: 30 ms
convertFromNodes Took: 41 ms
convertFromNodes Took: 35 ms
convertFromNodes Took: 136 ms
convertFromNodes Took: 71 ms
convertFromNodes Took: 5037 ms
convertFromNodes Took: 15056 ms
convertFromNodes Took: 37 ms
{code}
That is, address calculation sometimes takes as long as 15 seconds. Simple caching of the addresses has proven to fix the issue.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
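The caching fix amounts to memoizing reverse-DNS results per address, so a slow resolver (up to 15 s in the measurements above) is hit at most once per address. In the sketch below the resolver function is a stand-in for InetAddress.getHostName(); the class and field names are illustrative.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class HostNameCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> resolver; // stand-in for reverse DNS
    int resolverCalls;

    HostNameCacheSketch(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    // Resolve at most once per address; subsequent lookups are cache hits.
    String hostName(String addr) {
        return cache.computeIfAbsent(addr, a -> {
            resolverCalls++;
            return resolver.apply(a);
        });
    }

    public static void main(String[] args) {
        HostNameCacheSketch c = new HostNameCacheSketch(a -> "host-" + a);
        c.hostName("10.0.0.1");
        c.hostName("10.0.0.1"); // served from cache, resolver not called again
        System.out.println(c.resolverCalls); // 1
    }
}
```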
[jira] [Created] (IGNITE-3877) Clarify if IgfsFile -> FileStatus conversion should treat groupBlockSize as blockSize
Ivan Veselovsky created IGNITE-3877:
---
Summary: Clarify if IgfsFile -> FileStatus conversion should treat groupBlockSize as blockSize
Key: IGNITE-3877
URL: https://issues.apache.org/jira/browse/IGNITE-3877
Project: Ignite
Issue Type: Bug
Components: IGFS
Affects Versions: 1.6
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky
Fix For: 1.8

While repairing the Metrics tests, test org.apache.ignite.igfs.Hadoop1DualAbstractTest#testMetricsBlock revealed the following problem: the org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem#convert(org.apache.ignite.igfs.IgfsFile) method treats groupBlockSize as blockSize for the Hadoop FileStatus. groupBlockSize can be several times larger than blockSize, so blockSize in the status differs from that in the original IgfsFile. Changing file.groupBlockSize() to file.blockSize() fixes the problem in the metrics tests, but creates problems in Hadoop tests that are bound to split calculation, since split calculation is related to block sizes. Need to 1) clarify if the treatment of groupBlockSize was intentional; 2) fix either the metrics tests or the Hadoop tests.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3872) Clarify #blockSize() behaviour in LocalFileSystemIgfsFile
Ivan Veselovsky created IGNITE-3872:
---
Summary: Clarify #blockSize() behaviour in LocalFileSystemIgfsFile
Key: IGNITE-3872
URL: https://issues.apache.org/jira/browse/IGNITE-3872
Project: Ignite
Issue Type: Bug
Components: IGFS
Affects Versions: 1.6
Reporter: Ivan Veselovsky
Assignee: Vladimir Ozerov
Fix For: 1.8

The LocalFileSystemIgfsFile constructor accepts a blockSize parameter and there is a field to store the value, but zero is always passed in, so LocalFileSystemIgfsFile#blockSize() is always zero.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3312) [Test] HadoopMapReduceEmbeddedSelfTest.testMultiReducerWholeMapReduceExecution flakily fails.
Ivan Veselovsky created IGNITE-3312:
---
Summary: [Test] HadoopMapReduceEmbeddedSelfTest.testMultiReducerWholeMapReduceExecution flakily fails.
Key: IGNITE-3312
URL: https://issues.apache.org/jira/browse/IGNITE-3312
Project: Ignite
Issue Type: Bug
Reporter: Ivan Veselovsky
Assignee: Ivan Veselovsky

HadoopMapReduceEmbeddedSelfTest.testMultiReducerWholeMapReduceExecution fails with ~20% probability. The failure cause is either the 1st or the 2nd marked line in the following code (org.apache.ignite.internal.processors.igfs.IgfsMetaManager#create):
{code}
// Check: can we overwrite it?
if (!overwrite)
    throw new IgfsPathAlreadyExistsException("Failed to create a file: " + path); // * #1

// Check if file already opened for write.
if (oldInfo.lockId() != null)
    throw new IgfsException("File is already opened for write: " + path); // * #2
{code}
Diagnostics show that the creation of the same file is really attempted several times from one thread.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3285) A reference to HadoopClassLoader may be held in IGFS service pool threads after the job finishes.
Ivan Veselovsky created IGNITE-3285: --- Summary: A reference to HadoopClassLoader may be held in IGFS service pool threads after the job finishes. Key: IGNITE-3285 URL: https://issues.apache.org/jira/browse/IGNITE-3285 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Memory profiling shows that an instance of HadoopClassLoader used for a Hadoop job may still be referenced after that job finishes. This happens for 2 reasons. 1) When a new thread in the IGFS pool is created by a thread that has HadoopClassLoader as its current context class loader, this class loader is inherently propagated as the context class loader of the created thread: {code} java.lang.Throwable: 00igfs-#85%null%.: set hadoop class loader: HadoopClassLoader [name=hadoop-task-6b4d1037-65df-4e83-a7f8-7338e13ab1cf_1-SETUP-0]. Current cl = HadoopClassLoader [name=hadoop-task-6b4d1037-65df-4e83-a7f8-7338e13ab1cf_1-SETUP-0] at org.apache.ignite.thread.IgniteThread.(IgniteThread.java:83) at org.apache.ignite.thread.IgniteThread.(IgniteThread.java:62) at org.apache.ignite.thread.IgniteThreadFactory$1.(IgniteThreadFactory.java:62) at org.apache.ignite.thread.IgniteThreadFactory.newThread(IgniteThreadFactory.java:62) at java.util.concurrent.ThreadPoolExecutor$Worker.(ThreadPoolExecutor.java:610) at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:924) at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360) at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:132) at org.apache.ignite.internal.processors.igfs.IgfsDataManager.callIgfsLocalSafe(IgfsDataManager.java:1133) at org.apache.ignite.internal.processors.igfs.IgfsDataManager.processBatch(IgfsDataManager.java:1024) at org.apache.ignite.internal.processors.igfs.IgfsDataManager.access$2500(IgfsDataManager.java:100) at org.apache.ignite.internal.processors.igfs.IgfsDataManager$BlocksWriter.storeDataBlocks(IgfsDataManager.java:1416) at 
org.apache.ignite.internal.processors.igfs.IgfsDataManager.storeDataBlocks(IgfsDataManager.java:538) at org.apache.ignite.internal.processors.igfs.IgfsOutputStreamImpl.storeDataBlock(IgfsOutputStreamImpl.java:193) at org.apache.ignite.internal.processors.igfs.IgfsOutputStreamAdapter.sendData(IgfsOutputStreamAdapter.java:252) at org.apache.ignite.internal.processors.igfs.IgfsOutputStreamAdapter.write(IgfsOutputStreamAdapter.java:135) at org.apache.ignite.internal.processors.hadoop.igfs.HadoopIgfsInProc.writeData(HadoopIgfsInProc.java:440) at org.apache.ignite.internal.processors.hadoop.igfs.HadoopIgfsOutputStream.write(HadoopIgfsOutputStream.java:112) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58) at java.io.DataOutputStream.write(DataOutputStream.java:107) at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1333) at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:83) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2Context.write(HadoopV2Context.java:144) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) at org.apache.hadoop.examples.RandomTextWriter$RandomTextMapper.map(RandomTextWriter.java:140) at org.apache.hadoop.examples.RandomTextWriter$RandomTextMapper.map(RandomTextWriter.java:102) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2MapTask.run0(HadoopV2MapTask.java:74) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2Task.run(HadoopV2Task.java:54) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext.run(HadoopV2TaskContext.java:249) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.runTask(HadoopRunnableTask.java:201) at 
org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call0(HadoopRunnableTask.java:144) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:116) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:114) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext.runAsJobOwner(HadoopV2TaskContext.java:544) at
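Reason 1 above can be reproduced with plain JDK classes: a thread created while a custom loader is the context class loader silently inherits that loader. URLClassLoader here is just a stand-in for HadoopClassLoader:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Demonstrates context-class-loader inheritance: the worker thread created
// below keeps a reference to the custom loader that was current at creation.
class ContextLoaderDemo {
    static boolean inheritsContextLoader() throws InterruptedException {
        ClassLoader custom = new URLClassLoader(new URL[0]);
        ClassLoader old = Thread.currentThread().getContextClassLoader();
        Thread.currentThread().setContextClassLoader(custom);
        try {
            final ClassLoader[] seen = new ClassLoader[1];
            // the Thread constructor copies the creator's context class loader
            Thread worker = new Thread(() ->
                seen[0] = Thread.currentThread().getContextClassLoader());
            worker.start();
            worker.join();
            return seen[0] == custom; // worker now references the custom loader
        }
        finally {
            Thread.currentThread().setContextClassLoader(old);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(inheritsContextLoader()); // prints "true"
    }
}
```

This is why a long-lived IGFS pool thread spawned during a Hadoop task can pin the task's HadoopClassLoader after the job completes.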
[jira] [Created] (IGNITE-3186) [Test] org.apache.ignite.internal.processors.igfs.IgfsProcessorValidationSelfTest#testInvalidEndpointTcpPort fails on master
Ivan Veselovsky created IGNITE-3186: --- Summary: [Test] org.apache.ignite.internal.processors.igfs.IgfsProcessorValidationSelfTest#testInvalidEndpointTcpPort fails on master Key: IGNITE-3186 URL: https://issues.apache.org/jira/browse/IGNITE-3186 Project: Ignite Issue Type: Bug Components: IGFS Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky There is no such problem in the 1.6 release; the test only fails on the master branch. The root cause is that an SPI now does not allow one instance to be started more than once. The test does 3 checks, but uses one configuration instance for all of them. The first check passes, then the 2nd attempts to start the node, but an unexpected error happens when an attempt is made to start the already started SPI. Caused by: class org.apache.ignite.IgniteCheckedException: SPI has already been started (always create new configuration instance for each starting Ignite instances) [spi=TcpCommunicationSpi [connectGate=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$ConnectGateway@ceb47b, srvLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2@2aaf7a7, locAddr=127.0.0.1, locHost=localhost/127.0.0.1, locPort=45050, locPortRange=100, shmemPort=48100, directBuf=true, directSndBuf=false, idleConnTimeout=3, connTimeout=5000, maxConnTimeout=60, reconCnt=10, sockSndBuf=32768, sockRcvBuf=32768, msgQueueLimit=1024, slowClientQueueLimit=0, nioSrvr=null, shmemSrv=IpcSharedMemoryServerEndpoint [port=48100, tokDirPath=ipc/shmem, size=262144, tokDir=/mnt/tc_disk/temp/buildTmp/ignite/work/ipc/shmem/10fe89ee-1969-4aab-b91c-bb415b45c001-15910, locNodeId=10fe89ee-1969-4aab-b91c-bb415b45c001, gridName=g1, omitOutOfResourcesWarn=true, pid=15910, closed=true], tcpNoDelay=true, ackSndThreshold=16, unackedMsgsBufSize=0, sockWriteTimeout=2000, lsnr=null, boundTcpPort=-1, boundTcpShmemPort=48100, selectorsCnt=4, addrRslvr=null, rcvdMsgsCnt=0, sentMsgsCnt=0, rcvdBytesCnt=0, sentBytesCnt=0, ctxInitLatch=java.util.concurrent.CountDownLatch@2668f64f[Count = 0], stopping=true, 
metricsLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$3@3502d03c]] at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:956) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1736) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1589) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1042) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:569) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:516) at org.apache.ignite.Ignition.start(Ignition.java:322) ... 11 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
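The failure mode above boils down to a "start once" guard. This is a minimal, self-contained sketch of that pattern, not the actual TcpCommunicationSpi code:

```java
// Minimal sketch of a start-once guard: reusing one SPI/configuration
// instance across two node starts makes the second start() throw, as in
// the trace above.
class OneShotSpi {
    private boolean started;

    void start() {
        if (started)
            throw new IllegalStateException("SPI has already been started");
        started = true;
    }

    /** Simulates the test's mistake: one instance, two node starts. */
    static boolean secondStartFails() {
        OneShotSpi spi = new OneShotSpi();
        spi.start();            // first check: node starts fine
        try {
            spi.start();        // second check reuses the same instance
            return false;
        }
        catch (IllegalStateException e) {
            return true;        // second start is rejected
        }
    }

    public static void main(String[] args) {
        System.out.println(secondStartFails());
    }
}
```

The fix implied by the error message is simply to build a fresh configuration (and hence fresh SPI instances) for each Ignite instance the test starts.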
[jira] [Created] (IGNITE-3063) IgfsClientCacheSelfTest.testFormat flakily fails
Ivan Veselovsky created IGNITE-3063: --- Summary: IgfsClientCacheSelfTest.testFormat flakily fails Key: IGNITE-3063 URL: https://issues.apache.org/jira/browse/IGNITE-3063 Project: Ignite Issue Type: Bug Components: IGFS Affects Versions: 1.5.0.final Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: 1.6 IgfsClientCacheSelfTest.testFormat flakily fails. The main problems with the format() operation were fixed in IGNITE-586, but this problem is different: the test fails at the very beginning because the number of entries in the data cache is greater than zero. That is, the cleanup performed after the previous test failed to clean up the data cache completely. The cleanup mechanism and the method of asserting cache emptiness should be re-implemented: in the #clean() method we use the getMetaCache(igfs).keySet() and getDataCache(igfs).size() methods, which return only the number of *local* entries, while at the beginning of the #testFormat() method we use dataCache.size(new CachePeekMode[] {CachePeekMode.ALL});, which returns all the entries, and this assertion fails. {code} --- Stdout: --- [13:50:10,335][INFO ][main][root] >>> Starting test: IgfsClientCacheSelfTest#testFormat <<< [13:50:10,338][INFO ][main][root] >>> Stopping test: IgfsClientCacheSelfTest#testFormat in 3 ms <<< --- Stderr: --- [13:50:10,338][ERROR][main][root] Test failed. 
java.lang.AssertionError: Initial data cache size = 2 at org.apache.ignite.internal.processors.igfs.IgfsAbstractSelfTest.testFormat(IgfsAbstractSelfTest.java:983) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at junit.framework.TestCase.runTest(TestCase.java:176) at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:1759) at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:118) at org.apache.ignite.testframework.junits.GridAbstractTest$4.run(GridAbstractTest.java:1697) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
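The local-vs-ALL mismatch above can be illustrated with two hypothetical node-local stores (plain maps standing in for per-node cache partitions; none of this is the Ignite cache API):

```java
import java.util.Map;

// Illustrates the cleanup bug: a size check that only sees local entries can
// report zero leftovers while CachePeekMode.ALL-style accounting still finds
// entries on other nodes.
class PeekModeDemo {
    static final Map<String, Integer> NODE_0 = Map.of();                // local node, cleaned
    static final Map<String, Integer> NODE_1 = Map.of("b", 2, "c", 3); // remote node, leftovers

    /** What a local-only check (keySet()/size()) sees. */
    static int localSize() {
        return NODE_0.size();
    }

    /** What a cluster-wide (ALL) count sees. */
    static int allSize() {
        return NODE_0.size() + NODE_1.size();
    }

    public static void main(String[] args) {
        // cleanup "passed" locally (0), but the ALL count is non-zero (2)
        System.out.println(localSize() + " vs " + allSize());
    }
}
```

This is why the "Initial data cache size = 2" assertion can fire even though the previous test's cleanup check reported an empty cache.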
[jira] [Created] (IGNITE-3021) Test org.apache.ignite.internal.processors.igfs.IgfsStreamsSelfTest#testCreateFileColocated fails sometimes.
Ivan Veselovsky created IGNITE-3021: --- Summary: Test org.apache.ignite.internal.processors.igfs.IgfsStreamsSelfTest#testCreateFileColocated fails sometimes. Key: IGNITE-3021 URL: https://issues.apache.org/jira/browse/IGNITE-3021 Project: Ignite Issue Type: Bug Components: IGFS Affects Versions: 1.4 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: 1.6 Test org.apache.ignite.internal.processors.igfs.IgfsStreamsSelfTest#testCreateFileColocated fails sometimes on the master branch: {code} --- Stderr: --- [15:16:32,400][ERROR][main][root] Test failed. junit.framework.AssertionFailedError: expected:<1> but was:<2> at junit.framework.Assert.fail(Assert.java:57) at junit.framework.Assert.failNotEquals(Assert.java:329) at junit.framework.Assert.assertEquals(Assert.java:78) at junit.framework.Assert.assertEquals(Assert.java:234) at junit.framework.Assert.assertEquals(Assert.java:241) at junit.framework.TestCase.assertEquals(TestCase.java:409) at org.apache.ignite.internal.processors.igfs.IgfsStreamsSelfTest.testCreateFileColocated(IgfsStreamsSelfTest.java:237) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at junit.framework.TestCase.runTest(TestCase.java:176) at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:1759) at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:118) at org.apache.ignite.testframework.junits.GridAbstractTest$4.run(GridAbstractTest.java:1697) at java.lang.Thread.run(Thread.java:745) {code} The problem is that data cache rebalancing happens during the test, which makes data block affinity calculation non-reproducible: a block that initially maps to the 0-th node maps to a different node after some time. 
The suggested fix is to switch off automatic rebalancing and do the rebalancing manually only once at the beginning of the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-2966) Hadoop tests: in case of Hadoop download failure do not delete the entire install directory
Ivan Veselovsky created IGNITE-2966: --- Summary: Hadoop tests: in case of Hadoop download failure do not delete the entire install directory Key: IGNITE-2966 URL: https://issues.apache.org/jira/browse/IGNITE-2966 Project: Ignite Issue Type: Bug Components: hadoop Affects Versions: 1.6 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: 1.6 The Hadoop distribution is downloaded automatically for Hadoop tests. Several download URLs are tried. In case of failure the download is cleaned up and the next URL is tried. But due to a bug in org.apache.ignite.testsuites.IgniteHadoopTestSuite, line 333, the entire install directory is deleted, while only the component whose download failed should be deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
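The intended cleanup can be sketched with plain java.nio.file. The method and directory names below are illustrative assumptions, not the actual IgniteHadoopTestSuite code:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch of the fix: after a failed download, delete only installDir/component,
// leaving sibling components intact.
class DownloadCleanup {
    static void cleanupFailedComponent(Path installDir, String component) throws IOException {
        Path dest = installDir.resolve(component);
        if (!Files.exists(dest))
            return;
        try (Stream<Path> paths = Files.walk(dest)) {
            // reverse order deletes children before their parent directories
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                }
                catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    /** Self-check under java.io.tmpdir: failed component removed, sibling survives. */
    static boolean demo() throws IOException {
        Path root = Files.createTempDirectory("install");
        Files.createDirectories(root.resolve("hadoop").resolve("bin"));
        Files.createDirectories(root.resolve("hive"));
        cleanupFailedComponent(root, "hadoop");
        return !Files.exists(root.resolve("hadoop")) && Files.exists(root.resolve("hive"));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```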
[jira] [Created] (IGNITE-2938) IgfsBackupsDualAsyncSelfTest.testAppendParentMissing and IgfsBackupsDualAsyncSelfTest.testAppendParentMissingPartially fail sometimes on master
Ivan Veselovsky created IGNITE-2938: --- Summary: IgfsBackupsDualAsyncSelfTest.testAppendParentMissing and IgfsBackupsDualAsyncSelfTest.testAppendParentMissingPartially fail sometimes on master Key: IGNITE-2938 URL: https://issues.apache.org/jira/browse/IGNITE-2938 Project: Ignite Issue Type: Test Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Tests IgfsBackupsDualAsyncSelfTest.testAppendParentMissing and IgfsBackupsDualAsyncSelfTest.testAppendParentMissingPartially fail from time to time on master -- need to investigate. It looks like this started to happen after the fix https://issues.apache.org/jira/browse/IGNITE-1631 . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-2859) IGFS: test org.apache.ignite.internal.processors.igfs.IgfsStartCacheTest#testCacheStart flakily fails
Ivan Veselovsky created IGNITE-2859: --- Summary: IGFS: test org.apache.ignite.internal.processors.igfs.IgfsStartCacheTest#testCacheStart flakily fails Key: IGNITE-2859 URL: https://issues.apache.org/jira/browse/IGNITE-2859 Project: Ignite Issue Type: Bug Components: IGFS Affects Versions: 1.6 Reporter: Ivan Veselovsky Fix For: 1.6 {code} java.io.IOException: File was concurrently deleted: /test/test.file at org.apache.ignite.internal.processors.igfs.IgfsOutputStreamImpl.flush(IgfsOutputStreamImpl.java:278) at org.apache.ignite.internal.processors.igfs.IgfsOutputStreamAdapter.close(IgfsOutputStreamAdapter.java:182) at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:320) at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:149) at java.io.OutputStreamWriter.close(OutputStreamWriter.java:233) at java.io.BufferedWriter.close(BufferedWriter.java:266) at org.apache.ignite.internal.processors.igfs.IgfsStartCacheTest.testCacheStart(IgfsStartCacheTest.java:127) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at junit.framework.TestCase.runTest(TestCase.java:176) at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:1758) at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:118) at org.apache.ignite.testframework.junits.GridAbstractTest$4.run(GridAbstractTest.java:1696) at java.lang.Thread.run(Thread.java:745) {code} In the method org.apache.ignite.internal.processors.igfs.IgfsOutputStreamImpl#flush() the existence of the file being written is checked with id2InfoPrj.get(id) . 
For some reason it appears that sometimes this get returns null, even though the method org.apache.ignite.internal.processors.igfs.IgfsMetaManager#create is guaranteed to have returned at this point, and the file creation transaction must already be committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-2808) IGFS: re-create lock relaxed version
Ivan Veselovsky created IGNITE-2808: --- Summary: IGFS: re-create lock relaxed version Key: IGNITE-2808 URL: https://issues.apache.org/jira/browse/IGNITE-2808 Project: Ignite Issue Type: Bug Components: IGFS Affects Versions: 1.6 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: 1.6 Earlier we had an IGFS implementation that seemed to be fast, but was not absolutely correct in terms of synchronization logic. As we now suspect some problems with IGFS performance, we need to create such a version of the implementation again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-2807) IGFS: re-create lock relaxed version
Ivan Veselovsky created IGNITE-2807: --- Summary: IGFS: re-create lock relaxed version Key: IGNITE-2807 URL: https://issues.apache.org/jira/browse/IGNITE-2807 Project: Ignite Issue Type: Bug Components: IGFS Affects Versions: 1.6 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: 1.6 Earlier we had an IGFS implementation that seemed to be fast, but was not absolutely correct in terms of synchronization logic. As we now suspect some problems with IGFS performance, we need to create such a version of the implementation again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-2806) IGFS: re-create lock relaxed version
Ivan Veselovsky created IGNITE-2806: --- Summary: IGFS: re-create lock relaxed version Key: IGNITE-2806 URL: https://issues.apache.org/jira/browse/IGNITE-2806 Project: Ignite Issue Type: Bug Components: IGFS Affects Versions: 1.6 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: 1.6 Earlier we had an IGFS implementation that seemed to be fast, but was not absolutely correct in terms of synchronization logic. As we now suspect some problems with IGFS performance, we need to create such a version of the implementation again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-2725) Assertion in org.apache.ignite.internal.processors.hadoop.jobtracker.HadoopJobTracker.CancelJobProcessor#update fails if a job failed with an exception
Ivan Veselovsky created IGNITE-2725: --- Summary: Assertion in org.apache.ignite.internal.processors.hadoop.jobtracker.HadoopJobTracker.CancelJobProcessor#update fails if a job failed with an exception Key: IGNITE-2725 URL: https://issues.apache.org/jira/browse/IGNITE-2725 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky The assertion in org.apache.ignite.internal.processors.hadoop.jobtracker.HadoopJobTracker.CancelJobProcessor#update : 1584 fails if a Hadoop job failed with an exception. The problem is that the assertion expects the phase to be PHASE_CANCELLING, while the actual phase is PHASE_COMPLETE. {code} assert meta.phase() == PHASE_CANCELLING || err != null: "Invalid phase for cancel: " + meta; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-2438) IGFS: in dual modes file modification time should be propagated from the underlying Fs.
Ivan Veselovsky created IGNITE-2438: --- Summary: IGFS: in dual modes file modification time should be propagated from the underlying Fs. Key: IGNITE-2438 URL: https://issues.apache.org/jira/browse/IGNITE-2438 Project: Ignite Issue Type: Bug Components: IGFS Affects Versions: 1.5 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: 1.6 Currently IGFS in dual mode reports the file modification time as the time when that file was created on IGFS (propagated from the underlying file system), but it should report an exact copy of the underlying file system's file modification time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-2308) Should correctly detect dependencies on Hadoop classes
Ivan Veselovsky created IGNITE-2308: --- Summary: Should correctly detect dependencies on Hadoop classes Key: IGNITE-2308 URL: https://issues.apache.org/jira/browse/IGNITE-2308 Project: Ignite Issue Type: Bug Components: hadoop Affects Versions: ignite-1.4 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: 1.5 While fixing IGNITE-2206 we revealed that dependencies on Hadoop code are not detected in all cases. In particular, if an Ignite class has a return type of org.apache.hadoop.fs.FileSystem, that dependency is not detected by HadoopClassLoader. The reason is that the org.objectweb.asm.ClassVisitor, MethodVisitor, etc. implementations are not complete; they ignore many dependencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
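The missed dependency lives in the JVM method descriptor: the return type is encoded after the ')', e.g. "()Lorg/apache/hadoop/fs/FileSystem;". A visitor that never inspects that part of the descriptor cannot see the Hadoop dependency. This is a minimal extraction for object return types, without ASM (primitives and arrays are left as raw descriptors):

```java
// Extracts the return-type class name from a JVM method descriptor.
// A dependency scanner would check whether this name starts with "org.apache.hadoop.".
class ReturnTypeDep {
    static String returnTypeOf(String methodDesc) {
        String ret = methodDesc.substring(methodDesc.indexOf(')') + 1);
        if (ret.startsWith("L") && ret.endsWith(";"))
            return ret.substring(1, ret.length() - 1).replace('/', '.');
        return ret; // primitive ("V", "I", ...) or array descriptor
    }

    public static void main(String[] args) {
        System.out.println(returnTypeOf("()Lorg/apache/hadoop/fs/FileSystem;"));
    }
}
```

In the real fix this inspection would be done via the ASM visitor callbacks (e.g. on each method's descriptor), but the descriptor format shown here is the part the incomplete visitors were skipping.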
[jira] [Created] (IGNITE-2218) Ignite nodes cannot run Hadoop native code (e.g. snappy)
Ivan Veselovsky created IGNITE-2218: --- Summary: Ignite nodes cannot run Hadoop native code (e.g. snappy) Key: IGNITE-2218 URL: https://issues.apache.org/jira/browse/IGNITE-2218 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Mapreduce tasks fail with the following exception: {code} class org.apache.ignite.IgniteCheckedException: class org.apache.ignite.IgniteCheckedException: native snappy library not available: this version of libhadoop was built without snappy support. at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2MapTask.run0(HadoopV2MapTask.java:105) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2Task.run(HadoopV2Task.java:54) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext.run(HadoopV2TaskContext.java:249) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.runTask(HadoopRunnableTask.java:201) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call0(HadoopRunnableTask.java:144) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:116) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:114) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext$1.run(HadoopV2TaskContext.java:550) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext.runAsJobOwner(HadoopV2TaskContext.java:548) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:114) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:46) at 
org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopExecutorService$2.body(HadoopExecutorService.java:186) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support. at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65) at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193) at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178) at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:90) at com.splunk.mr.input.SplunkLineRecordReader.vixInitialize(SplunkLineRecordReader.java:17) at com.splunk.mr.input.BaseSplunkRecordReader.initialize(BaseSplunkRecordReader.java:95) at com.splunk.mr.JobSubmitterInputFormat.createRecordReader(JobSubmitterInputFormat.java:66) at com.splunk.mr.JobSubmitterInputFormat.createRecordReader(JobSubmitterInputFormat.java:21) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2MapTask.run0(HadoopV2MapTask.java:74) ... 
16 more at org.apache.ignite.internal.processors.hadoop.HadoopUtils.transformException(HadoopUtils.java:290) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext.run(HadoopV2TaskContext.java:255) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.runTask(HadoopRunnableTask.java:201) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call0(HadoopRunnableTask.java:144) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:116) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask$1.call(HadoopRunnableTask.java:114) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext$1.run(HadoopV2TaskContext.java:550) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.ignite.internal.processors.hadoop.v2.HadoopV2TaskContext.runAsJobOwner(HadoopV2TaskContext.java:548) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:114) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopRunnableTask.call(HadoopRunnableTask.java:46) at org.apache.ignite.internal.processors.hadoop.taskexecutor.HadoopExecutorService$2.body(HadoopExecutorService.java:186) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) {code} This happens because the class org.apache.hadoop.util.NativeCodeLoader is substituted with org.apache.ignite.internal.processors.hadoop.v2.HadoopNativeCodeLoader . -- This
[jira] [Created] (IGNITE-2206) Make the file SecondaryFileSystemProvider pluggable
Ivan Veselovsky created IGNITE-2206: --- Summary: Make the file SecondaryFileSystemProvider pluggable Key: IGNITE-2206 URL: https://issues.apache.org/jira/browse/IGNITE-2206 Project: Ignite Issue Type: Sub-task Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1925) Test HadoopSkipListSelfTest.testLevel flakily fails
Ivan Veselovsky created IGNITE-1925: --- Summary: Test HadoopSkipListSelfTest.testLevel flakily fails Key: IGNITE-1925 URL: https://issues.apache.org/jira/browse/IGNITE-1925 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Priority: Minor Test HadoopSkipListSelfTest.testLevel fails from time to time with ~ 3% probability. junit.framework.AssertionFailedError: null at junit.framework.Assert.fail(Assert.java:55) at junit.framework.Assert.assertTrue(Assert.java:22) at junit.framework.Assert.assertTrue(Assert.java:31) at junit.framework.TestCase.assertTrue(TestCase.java:201) at org.apache.ignite.internal.processors.hadoop.shuffle.collections.HadoopSkipListSelfTest.testLevel(HadoopSkipListSelfTest.java:83) --- Stdout: --- -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1850) IGFS: implicitly created directories should always have default properties
Ivan Veselovsky created IGNITE-1850: --- Summary: IGFS: implicitly created directories should always have default properties Key: IGNITE-1850 URL: https://issues.apache.org/jira/browse/IGNITE-1850 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Currently we have #create, #append, and #mkdirs operations that implicitly create parent directories if they are absent. Now #mkdirs uses the properties passed in for the implicitly created directories if they are not null, and uses default properties (with the 0777 permission flag) if the properties are not given. #create & #append use, for the implicitly created directories, the properties passed in for the newly created file if they are not null, and use default properties (with the 0777 permission flag) if they are not given. It would be more logical to always use defaults for the implicitly created directories. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
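The current-vs-proposed rule can be contrasted in a few lines. The property map shape and the default value below are illustrative assumptions, not the IGFS data model:

```java
import java.util.Map;

// Contrast of the current and proposed property selection for an implicitly
// created parent directory. DFLT stands in for the IGFS defaults (0777).
class ImplicitDirProps {
    static final Map<String, String> DFLT = Map.of("permission", "0777");

    /** Current #create/#append behavior: reuse the file's properties if present. */
    static Map<String, String> current(Map<String, String> fileProps) {
        return fileProps != null ? fileProps : DFLT;
    }

    /** Proposed behavior: implicitly created directories always get defaults. */
    static Map<String, String> proposed(Map<String, String> fileProps) {
        return DFLT;
    }

    public static void main(String[] args) {
        Map<String, String> fileProps = Map.of("permission", "0644");
        // today the parent directory would inherit 0644; proposed: always 0777
        System.out.println(current(fileProps) + " -> " + proposed(fileProps));
    }
}
```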
[jira] [Created] (IGNITE-1778) IGFS: implement rollback procedure: cleanup the "reserved" data.
Ivan Veselovsky created IGNITE-1778: --- Summary: IGFS: implement rollback procedure: cleanup the "reserved" data. Key: IGNITE-1778 URL: https://issues.apache.org/jira/browse/IGNITE-1778 Project: Ignite Issue Type: Sub-task Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky The following procedure is applied if the file is locked: 1) take the node id from the lock id. 2) see via the discovery service if this node is alive. 3) if yes, return (we cannot lock the file). 4) if not, do a rollback: - delete all the blocks in the "reserved" range from the data cache. - set the reserved range to zero. - remove the lock from the FileInfo. The above procedure should be performed upon every attempt to take a lock, and (maybe) periodically while traversing the file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
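Steps 1)-4) above can be sketched as follows. Every type and member name here (Discovery, DataCache, FileInfo, the reserved-range fields) is a hypothetical stand-in, not the actual IGFS meta structures:

```java
import java.util.UUID;

// Hedged sketch of the proposed rollback: if the lock owner node is dead,
// drop the reserved blocks, zero the reserved range, and clear the lock.
class LockRollback {
    interface Discovery { boolean alive(UUID nodeId); }
    interface DataCache { void removeBlocks(UUID fileId, long from, long to); }

    static final class FileInfo {
        UUID id = UUID.randomUUID();
        UUID lockOwner;              // node id taken from the lock id (step 1)
        long reservedFrom, reservedTo;
    }

    /** Returns true if the caller may now take the lock. */
    static boolean tryUnlockDeadOwner(FileInfo f, Discovery disco, DataCache data) {
        if (f.lockOwner == null)
            return true;                     // not locked at all
        if (disco.alive(f.lockOwner))        // step 2: check via discovery
            return false;                    // step 3: owner alive, cannot lock
        // step 4: rollback
        data.removeBlocks(f.id, f.reservedFrom, f.reservedTo); // drop reserved blocks
        f.reservedFrom = f.reservedTo = 0;   // set reserved range to zero
        f.lockOwner = null;                  // remove the lock from the FileInfo
        return true;
    }

    /** Dead owner: lock freed; live owner: lock attempt refused. */
    static boolean demo() {
        FileInfo dead = new FileInfo();
        dead.lockOwner = UUID.randomUUID();
        dead.reservedTo = 4096;
        boolean freed = tryUnlockDeadOwner(dead, id -> false, (id, from, to) -> { });

        FileInfo live = new FileInfo();
        live.lockOwner = UUID.randomUUID();
        boolean blocked = !tryUnlockDeadOwner(live, id -> true, (id, from, to) -> { });

        return freed && dead.lockOwner == null && dead.reservedTo == 0 && blocked;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```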
[jira] [Created] (IGNITE-1777) IGFS: write files with fail-safe logic: "lock" -> "reserve space" -> "write" -> "size update, unlock"
Ivan Veselovsky created IGNITE-1777: --- Summary: IGFS: write files with fail-safe logic: "lock" -> "reserve space" -> "write" -> "size update, unlock" Key: IGNITE-1777 URL: https://issues.apache.org/jira/browse/IGNITE-1777 Project: Ignite Issue Type: Sub-task Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1740) IGFS: file write lock should contain node id
Ivan Veselovsky created IGNITE-1740: --- Summary: IGFS: file write lock should contain node id Key: IGNITE-1740 URL: https://issues.apache.org/jira/browse/IGNITE-1740 Project: Ignite Issue Type: Sub-task Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky item "3)" from IGNITE-1697. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1691) [Test] CacheStopAndDestroySelfTest.testDhtDoubleDestroy fails with NPE sometimes.
Ivan Veselovsky created IGNITE-1691: --- Summary: [Test] CacheStopAndDestroySelfTest.testDhtDoubleDestroy fails with NPE sometimes. Key: IGNITE-1691 URL: https://issues.apache.org/jira/browse/IGNITE-1691 Project: Ignite Issue Type: Test Reporter: Ivan Veselovsky The problem is that c.context().config() returns null after the cache has been cleaned up. for (IgniteCacheProxy c : g0.context().cache().jcaches()) { CacheConfiguration cfg = c.context().config(); java.lang.NullPointerException: null at org.apache.ignite.testframework.junits.common.GridCommonAbstractTest.awaitPartitionMapExchange(GridCommonAbstractTest.java:416) at org.apache.ignite.internal.processors.cache.CacheStopAndDestroySelfTest.checkDestroyed(CacheStopAndDestroySelfTest.java:754) at org.apache.ignite.internal.processors.cache.CacheStopAndDestroySelfTest.dhtDestroy(CacheStopAndDestroySelfTest.java:226) at org.apache.ignite.internal.processors.cache.CacheStopAndDestroySelfTest.testDhtDoubleDestroy(CacheStopAndDestroySelfTest.java:199) Hint: to achieve better reproducibility, repeat the dhtDestroy() operation in the test not 2 but several tens of times. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1692) [Test] DataStreamProcessorSelfTest.testReplicated fails sometimes.
Ivan Veselovsky created IGNITE-1692: --- Summary: [Test] DataStreamProcessorSelfTest.testReplicated fails sometimes. Key: IGNITE-1692 URL: https://issues.apache.org/jira/browse/IGNITE-1692 Project: Ignite Issue Type: Test Reporter: Ivan Veselovsky DataStreamProcessorSelfTest.testReplicated fails with ~7% probability with the following error: {code} org.apache.ignite.IgniteCheckedException: Failed to find server node for cache (all affinity nodes have left the grid or cache was stopped): null at org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:6979) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:166) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:115) at org.apache.ignite.internal.util.future.GridCompoundFuture$Listener.apply(GridCompoundFuture.java:311) at org.apache.ignite.internal.util.future.GridCompoundFuture$Listener.apply(GridCompoundFuture.java:302) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:262) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:250) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:380) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:346) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:334) at org.apache.ignite.testframework.GridTestUtils$5.run(GridTestUtils.java:675) at org.apache.ignite.testframework.GridTestUtils$7.call(GridTestUtils.java:966) at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86) Caused by: org.apache.ignite.cache.CacheServerNotFoundException: Failed to find server node for cache (all affinity nodes have left the grid or cache was stopped): null at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1604) at 
org.apache.ignite.internal.processors.cache.IgniteCacheFutureImpl.convertException(IgniteCacheFutureImpl.java:56) at org.apache.ignite.internal.util.future.IgniteFutureImpl.get(IgniteFutureImpl.java:122) at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessorSelfTest$1.call(DataStreamProcessorSelfTest.java:232) at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86) Caused by: org.apache.ignite.internal.cluster.ClusterTopologyServerNotFoundException: Failed to find server node for cache (all affinity nodes have left the grid or cache was stopped): null at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.nodes(DataStreamerImpl.java:772) at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.load0(DataStreamerImpl.java:638) at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.addDataInternal(DataStreamerImpl.java:547) at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.addData(DataStreamerImpl.java:583) at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessorSelfTest$1.call(DataStreamProcessorSelfTest.java:226) at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1690) [Test] IgniteCacheCreateRestartSelfTest.testStopOriginatingNode sometimes fails.
Ivan Veselovsky created IGNITE-1690: --- Summary: [Test] IgniteCacheCreateRestartSelfTest.testStopOriginatingNode sometimes fails. Key: IGNITE-1690 URL: https://issues.apache.org/jira/browse/IGNITE-1690 Project: Ignite Issue Type: Test Components: cache Affects Versions: ignite-1.4 Reporter: Ivan Veselovsky IgniteCacheCreateRestartSelfTest.testStopOriginatingNode sometimes fails (~5% probability) due to inability to register a cache metrics MBean: org.apache.ignite.IgniteCheckedException: Failed to register MBean for component: org.apache.ignite.internal.processors.cache.CacheMetricsMXBeanImpl@3a5e46b6 at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324) at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522) at org.apache.ignite.internal.util.IgniteUtils.registerCacheMBean(IgniteUtils.java:4355) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.registerMbean(GridCacheProcessor.java:3267) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.createCache(GridCacheProcessor.java:1526) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStart(GridCacheProcessor.java:813) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:937) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1617) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1484) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:965) at 
org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:526) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:725) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:709) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:686) at org.apache.ignite.internal.processors.cache.IgniteCacheCreateRestartSelfTest.testStopOriginatingNode(IgniteCacheCreateRestartSelfTest.java:104) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1697) IGFS: implement reliable Igfs failover logic
Ivan Veselovsky created IGNITE-1697: --- Summary: IGFS: implement reliable Igfs failover logic Key: IGNITE-1697 URL: https://issues.apache.org/jira/browse/IGNITE-1697 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: 1.5 Problems to solve: 1) Currently a write lock on a file may stay taken forever if a node has taken the lock and then crashed. 2) Currently the blocks of file content are not written as plain dataCache.put() operations, but are sent using ad-hoc async messages. This was done earlier to improve performance, but in order to implement reliable failover we need to get rid of that and use simple put() or asyncPut() cache operations. Solution plan: 1) Use async put to write file data blocks. 2) Perform writing using the scheme "lock" -> "reserve space" -> "write" -> "commit" -> "release lock". 3) The id of the node that locked a file should be recoverable from the lock id. 4) Upon taking a file lock, the following procedure should be performed: if the file is locked, take the node id of the node that locked it, then ask the DiscoveryProcessor whether that node is alive. If it is not (the node has left topology), perform a cleanup procedure: delete all the data blocks of the reserved data range, then delete the lock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
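Steps 3 and 4 of the plan above could be sketched roughly as follows. This is a minimal illustration of encoding the owning node's id into the lock id and deciding whether a lock is stale; the class and method names (FileLockIds, lockId, ownerOf, isStale) are hypothetical and not part of the actual Ignite codebase:

```java
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch: carry the locking node's id inside the lock id so a
// later reader can recover the owner (step 3 of the plan).
class FileLockIds {
    // Compose a lock id that embeds the owner node id before a ':' separator.
    static String lockId(UUID nodeId) {
        return nodeId + ":" + UUID.randomUUID();
    }

    // Recover the node id of the lock owner from the lock id.
    static UUID ownerOf(String lockId) {
        return UUID.fromString(lockId.substring(0, lockId.indexOf(':')));
    }

    // Cleanup decision (step 4): if the owner is no longer in topology, the
    // stale lock may be removed along with the reserved data blocks.
    static boolean isStale(String lockId, Set<UUID> aliveNodes) {
        return !aliveNodes.contains(ownerOf(lockId));
    }
}
```

In the real system the "alive nodes" set would come from the discovery subsystem rather than being passed in explicitly.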
[jira] [Created] (IGNITE-1699) Method org.apache.ignite.internal.util.GridJavaProcess#kill should be more reliable.
Ivan Veselovsky created IGNITE-1699: --- Summary: Method org.apache.ignite.internal.util.GridJavaProcess#kill should be more reliable. Key: IGNITE-1699 URL: https://issues.apache.org/jira/browse/IGNITE-1699 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky Test IpcSharedMemoryCrashDetectionSelfTest.testIgfsServerClientInteractionsUponClientKilling sometimes fails with the error below. I see the following problems in this method: 1) the exception (if any) should include the actual process exit code if it differs from the expected one; 2) the assertion on a zero exit code (org/apache/ignite/internal/util/GridJavaProcess.java:197) of the killing command is not always correct, e.g. on Unix systems killing a nonexistent process returns 1. So, if the process finished before kill was invoked, the method fails with an assertion error. {code} java.lang.AssertionError: Process killing was not successful at org.apache.ignite.internal.util.GridJavaProcess.kill(GridJavaProcess.java:197) at org.apache.ignite.internal.util.ipc.shmem.IpcSharedMemoryCrashDetectionSelfTest.interactWithClient(IpcSharedMemoryCrashDetectionSelfTest.java:343) at org.apache.ignite.internal.util.ipc.shmem.IpcSharedMemoryCrashDetectionSelfTest.testIgfsServerClientInteractionsUponClientKilling(IpcSharedMemoryCrashDetectionSelfTest.java:91) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at junit.framework.TestCase.runTest(TestCase.java:176) at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:1658) at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:112) at org.apache.ignite.testframework.junits.GridAbstractTest$4.run(GridAbstractTest.java:1596) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
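The tolerance for an already-exited process suggested above could look roughly like this. It is a hedged sketch, not the actual GridJavaProcess code; the KillResultCheck class and both method names are made up for illustration:

```java
// Hypothetical sketch of a more tolerant kill check: on Unix, `kill`
// returns a non-zero code (typically 1) when the target process no longer
// exists, so that outcome should not trip the assertion.
class KillResultCheck {
    // Returns true when the kill outcome should be treated as success:
    // either the kill command succeeded (0), or it failed with code 1
    // because the process had already exited on its own.
    static boolean killSucceeded(int exitCode, boolean processAlreadyExited) {
        return exitCode == 0 || (exitCode == 1 && processAlreadyExited);
    }

    // Include the actual exit code in the failure message (problem 1 above).
    static String failureMessage(int exitCode) {
        return "Process killing was not successful [exitCode=" + exitCode + ']';
    }
}
```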
[jira] [Created] (IGNITE-1685) BrokenBarrierException in test o.a.i.i.processors.cache.distributed.IgniteCacheClientNodeChangingTopologyTest.testAtomicClockPutAllMultinode
Ivan Veselovsky created IGNITE-1685: --- Summary: BrokenBarrierException in test o.a.i.i.processors.cache.distributed.IgniteCacheClientNodeChangingTopologyTest.testAtomicClockPutAllMultinode Key: IGNITE-1685 URL: https://issues.apache.org/jira/browse/IGNITE-1685 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky BrokenBarrierException followed by a hang observed in build http://94.72.60.102/viewLog.html?buildId=551330=Ignite_IgniteCache2=buildLog [16:48:10]W: [org.apache.ignite:ignite-core] java.util.concurrent.BrokenBarrierException [16:48:10]W: [org.apache.ignite:ignite-core] at java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:243) [16:48:10]W: [org.apache.ignite:ignite-core] at java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:355) [16:48:10]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.distributed.IgniteCacheClientNodeChangingTopologyTest$14.call(IgniteCacheClientNodeChangingTopologyTest.java:1553) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1656) Get rid of md5 and sha1
Ivan Veselovsky created IGNITE-1656: --- Summary: Get rid of md5 and sha1 Key: IGNITE-1656 URL: https://issues.apache.org/jira/browse/IGNITE-1656 Project: Ignite Issue Type: Bug Components: cache Affects Versions: ignite-1.4 Reporter: Ivan Veselovsky Fix For: 1.5 A description of the problem with regard to sha1 is available here: https://sites.google.com/site/itstheshappening/ . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1631) IGFS: append fails to create a new file in DUAL modes
Ivan Veselovsky created IGNITE-1631: --- Summary: IGFS: append fails to create a new file in DUAL modes Key: IGNITE-1631 URL: https://issues.apache.org/jira/browse/IGNITE-1631 Project: Ignite Issue Type: Bug Components: hadoop Affects Versions: ignite-1.4 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: ignite-1.5 An attempt to create a new file using the IGFS#append() method with the "create" flag set to true consistently fails with the exception below. Fix that and cover it with tests. Caused by: class org.apache.ignite.IgniteCheckedException: Failed to append to the file due to secondary file system exception: /dir/subdir/file2 at org.apache.ignite.internal.processors.igfs.IgfsMetaManager$6.onFailure(IgfsMetaManager.java:2332) at org.apache.ignite.internal.processors.igfs.IgfsMetaManager$6.onFailure(IgfsMetaManager.java:2284) at org.apache.ignite.internal.processors.igfs.IgfsMetaManager.synchronizeAndExecute(IgfsMetaManager.java:3065) at org.apache.ignite.internal.processors.igfs.IgfsMetaManager.synchronizeAndExecute(IgfsMetaManager.java:2860) at org.apache.ignite.internal.processors.igfs.IgfsMetaManager.appendDual(IgfsMetaManager.java:2337) at org.apache.ignite.internal.processors.igfs.IgfsImpl$16.call(IgfsImpl.java:1119) at org.apache.ignite.internal.processors.igfs.IgfsImpl$16.call(IgfsImpl.java:1104) at org.apache.ignite.internal.processors.igfs.IgfsImpl.safeOp(IgfsImpl.java:2014) ... 11 more Caused by: class org.apache.ignite.IgniteCheckedException: Failed to create path locally due to secondary file system exception: /dir at org.apache.ignite.internal.processors.igfs.IgfsMetaManager.synchronize(IgfsMetaManager.java:2814) at org.apache.ignite.internal.processors.igfs.IgfsMetaManager.synchronizeAndExecute(IgfsMetaManager.java:3018) ... 16 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1590) IGFS: update & create operations should be truly thread-safe.
Ivan Veselovsky created IGNITE-1590: --- Summary: IGFS: update & create operations should be truly thread-safe. Key: IGNITE-1590 URL: https://issues.apache.org/jira/browse/IGNITE-1590 Project: Ignite Issue Type: Bug Affects Versions: ignite-1.4 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: ignite-1.5 Fix the #create() operation & revise #update(). Currently the tests with many concurrent operations (involving delete/rename/mkdirs/create) reveal concurrency problems -- some tree entries get "lost". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1566) Hadoop: In case if IGFS name is missing in the URI, the connection failure message should be more informative.
Ivan Veselovsky created IGNITE-1566: --- Summary: Hadoop: In case if IGFS name is missing in the URI, the connection failure message should be more informative. Key: IGNITE-1566 URL: https://issues.apache.org/jira/browse/IGNITE-1566 Project: Ignite Issue Type: Bug Components: hadoop Affects Versions: ignite-1.4 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky A user has an IGFS named "igfs" configured on an Ignite node. He then tries to connect to it with a Hadoop client using the command {code} $ hadoop fs -ls igfs://127.0.0.1:10500/ {code} and gets the following error message: {code}ls: Failed to communicate with IGFS.{code} The problem is that the IGFS name is missing in the URI, but the error message does not give any hint about that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
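A check of the kind this issue asks for could be sketched as follows, assuming the authority form igfs://&lt;igfs-name&gt;@&lt;host&gt;:&lt;port&gt;/ used when a file system name is supplied; the IgfsUriCheck class and its methods are hypothetical, not the actual connector code:

```java
import java.net.URI;

// Hypothetical sketch: detect a missing IGFS name in the connection URI and
// produce an explicit hint instead of a generic communication failure.
class IgfsUriCheck {
    // With the name present the URI looks like igfs://igfs@127.0.0.1:10500/,
    // so the name occupies the user-info part of the authority.
    static boolean hasIgfsName(URI uri) {
        return uri.getUserInfo() != null && !uri.getUserInfo().isEmpty();
    }

    // A more informative message for the failure path.
    static String hint(URI uri) {
        return hasIgfsName(uri)
            ? "OK"
            : "IGFS name is missing in URI; expected form: igfs://<igfs-name>@<host>:<port>/";
    }
}
```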
[jira] [Created] (IGNITE-1573) IGFS: mkdirs operations should be truly thread-safe
Ivan Veselovsky created IGNITE-1573: --- Summary: IGFS: mkdirs operations should be truly thread-safe Key: IGNITE-1573 URL: https://issues.apache.org/jira/browse/IGNITE-1573 Project: Ignite Issue Type: Bug Components: hadoop Affects Versions: ignite-1.4 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: ignite-1.5 Need to lock all the existing parents and add the missing path in one transaction. If this fails due to concurrent creation, retry from the beginning. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
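The lock-and-retry loop described above could be sketched like this; TxRetry and retryTx are hypothetical names, and ConcurrentModificationException stands in for whatever conflict a real transaction layer would report:

```java
import java.util.ConcurrentModificationException;
import java.util.concurrent.Callable;

// Hypothetical sketch of the retry pattern: attempt the whole mkdirs step
// (lock existing parents, create the missing path) as one transactional
// action, and restart from scratch when a concurrent creation invalidates it.
class TxRetry {
    static <T> T retryTx(Callable<T> txAction, int maxAttempts) throws Exception {
        ConcurrentModificationException last = null;

        for (int i = 0; i < maxAttempts; i++) {
            try {
                // One attempt: lock parents + create missing dirs atomically.
                return txAction.call();
            }
            catch (ConcurrentModificationException e) {
                // Concurrent creation changed the tree under us: start over.
                last = e;
            }
        }

        if (last == null)
            throw new IllegalArgumentException("maxAttempts must be positive");

        throw last;
    }
}
```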
[jira] [Created] (IGNITE-1541) IGFS: mkdirs operation should be truly thread-safe.
Ivan Veselovsky created IGNITE-1541: --- Summary: IGFS: mkdirs operation should be truly thread-safe. Key: IGNITE-1541 URL: https://issues.apache.org/jira/browse/IGNITE-1541 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky In IGNITE-1515 we made delete thread-safe. But experience shows that concurrent delete + mkdirs operations nevertheless cause file system structure corruption, see test org.apache.ignite.internal.processors.igfs.IgfsAbstractSelfTest#testDeadlocksDeleteMkdirs . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1515) IGFS: delete operations should be truly thread-safe
Ivan Veselovsky created IGNITE-1515: --- Summary: IGFS: delete operations should be truly thread-safe Key: IGNITE-1515 URL: https://issues.apache.org/jira/browse/IGNITE-1515 Project: Ignite Issue Type: Bug Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky In IGNITE-586 we fixed a concurrency problem in rename/move operations. Since delete also removes entries from the tree, deletion operations should be fixed accordingly using similar logic. The IgfsMetaManager#move0(2) method is to be partially reused in the delete operation -- the code needs to be refactored accordingly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-1299) IGFS: meta update file unlock transaction should be retried.
Ivan Veselovsky created IGNITE-1299: --- Summary: IGFS: meta update file unlock transaction should be retried. Key: IGNITE-1299 URL: https://issues.apache.org/jira/browse/IGNITE-1299 Project: Ignite Issue Type: Bug Components: general Affects Versions: ignite-1.4 Reporter: Ivan Veselovsky Assignee: Ivan Veselovsky Fix For: ignite-1.4 -- This message was sent by Atlassian JIRA (v6.3.4#6332)