[
https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108303#comment-13108303
]
Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------
OK, I actually don't know what the best way to fix this is.
Hadoop sandboxes tasks so that java.io.tmpdir points to the task's temporary
directory under the mapred folder (in distributed mode). That is basically Hadoop's
contract for task-scoped temporary space.
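To make that contract concrete, here is a minimal sketch (illustrative only, not actual Hadoop or Mahout code; the class name is made up) of why a task-side writer ends up colliding in /tmp under local mapred:
{code:title=illustrative sketch, not Mahout code}
import java.io.File;

public class TaskTmpDirSketch {
  public static void main(String[] args) {
    // In distributed mode the TaskTracker launches the child JVM with
    // -Djava.io.tmpdir pointing inside the task attempt's working directory,
    // so this path is private to the task.
    String taskTmp = System.getProperty("java.io.tmpdir");

    // In local mapred mode there is no such override, so this is typically
    // just /tmp, shared by every process on the box -- hence the
    // "/tmp/q-temp.seq (Permission denied)" collision reported below.
    File qTemp = new File(taskTmp, "q-temp.seq");
    System.out.println("would write " + qTemp.getAbsolutePath());
  }
}
{code}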
My idea was to override that in the local mapred case so it points to something
more suitable for Mahout. But that "more suitable" location turns out to be /tmp
plus a suffix, as in the following code:
{code:title=MahoutTestCase.java}
protected final File getTestTempDir() throws IOException {
  if (testTempDir == null) {
    // Create a uniquely named per-test-class folder under java.io.tmpdir.
    String systemTmpDir = System.getProperty("java.io.tmpdir");
    long simpleRandomLong = (long) (Long.MAX_VALUE * Math.random());
    testTempDir = new File(systemTmpDir,
        "mahout-" + getClass().getSimpleName() + '-' + simpleRandomLong);
    if (!testTempDir.mkdir()) {
      throw new IOException("Could not create " + testTempDir);
    }
    testTempDir.deleteOnExit();
  }
  return testTempDir;
}
{code}
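For context, a hypothetical test in a MahoutTestCase subclass (an illustration, not code from the Mahout test suite) would route its scratch files through that helper like so:
{code:title=hypothetical usage sketch}
@Test
public void testWritesIntoItsOwnTempSpace() throws Exception {
  // getTestTempDir() hands back a per-test-class folder under java.io.tmpdir,
  // e.g. /tmp/mahout-LocalSSVDSolverDenseTest-<random>
  File workDir = getTestTempDir();
  File qTemp = new File(workDir, "q-temp.seq");
  // ... point the solver at qTemp instead of writing straight into /tmp ...
}
{code}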
So... it looks like Mahout's test framework is already hooked on java.io.tmpdir
(which Mahout itself does not override, so in test mode it probably already points
to /tmp). That means I cannot repoint java.io.tmpdir Mahout-wide, because Mahout
already attaches meaning to that variable.
I don't immediately see the best solution here.
I can probably change the solvers so that they don't write directly into the
task's root temp folder but create their own subfolder there; however, that still
doesn't guarantee an absence of clashes during tests (only getTestTempDir() would
guarantee that).
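As a rough sketch of that direction (hypothetical code; the class and method names are assumptions, not the actual QRFirstStep change), the solver could at least create a uniquely named subfolder instead of dropping q-temp.seq into the task temp root:
{code:title=sketch of a per-solver temp subfolder}
import java.io.File;
import java.io.IOException;

public class SolverTempDirSketch {
  /** Creates a uniquely named scratch folder under the task's tmp dir. */
  static File createSolverTempDir() throws IOException {
    File taskTmp = new File(System.getProperty("java.io.tmpdir"));
    // A unique suffix reduces, but does not eliminate, the chance of clashes
    // when several local-mode tests share the same java.io.tmpdir (/tmp).
    File dir = new File(taskTmp, "ssvd-qr-" + System.nanoTime());
    if (!dir.mkdirs()) {
      throw new IOException("Could not create " + dir);
    }
    return dir;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(createSolverTempDir());
  }
}
{code}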
So I would like to solicit some discussion here.
> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
> Key: MAHOUT-814
> URL: https://issues.apache.org/jira/browse/MAHOUT-814
> Project: Mahout
> Issue Type: Bug
> Reporter: Grant Ingersoll
> Assignee: Dmitriy Lyubimov
> Priority: Minor
>
> I am running Mahout in an environment where Jenkins is also running, and I am getting:
> {quote}
> java.io.FileNotFoundException: /tmp/q-temp.seq (Permission denied)
> at java.io.FileOutputStream.open(Native Method)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
> at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:187)
> at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:183)
> at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:241)
> at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:528)
> at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
> at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
> at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:284)
> at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.getTempQw(QRFirstStep.java:263)
> at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.flushSolver(QRFirstStep.java:104)
> at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.map(QRFirstStep.java:175)
> at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.collect(QRFirstStep.java:279)
> at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:142)
> at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:71)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> {quote}
> Also seeing the following tests fail:
> {quote}
> Tests in error:
> testSSVDSolverSparse(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
> testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
> testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
> testSSVDSolverDense(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
> {quote}
> I haven't checked all of them, but I suspect they are all due to the same
> reason. We should dynamically create a temp area for each test using
> temporary directories under the main temp dir.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira