[
https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108858#comment-13108858
]
Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------
Actually, from what I am reading, even java.io.tmpdir may be shared between
different tasks, even in distributed mode, if JVM sharing is enabled (I forget
the property name, but I also use it at a value of ~5-10 to speed up the setup
of small tasks).
In that case using java.io.tmpdir/q-temp may potentially be a problem as well.
+What I propose: it would probably be safer to use a directory
*$java.io.tmpdir/$taskAttemptID*, which would guarantee uniqueness per task
(though perhaps not in local tests; if it is not unique there, I will add
some random numbers).+
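A rough sketch of the directory scheme proposed above, in plain Java. The class and method names (TaskTempFiles, qTempFile) are made up for illustration and are not from the attached patch. In a real mapper the ID would presumably come from context.getTaskAttemptID().toString(); here it is just passed in as a string, and a random suffix stands in when no ID is available, as in local tests.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical helper, not part of Mahout: resolves a per-task scratch file.
final class TaskTempFiles {

  private TaskTempFiles() {
  }

  /**
   * Returns $java.io.tmpdir/<taskAttemptId>/q-temp.seq and creates the parent
   * directory if needed. When no attempt ID is given (assumed here to be the
   * local-test situation), a random directory name is used instead.
   */
  static Path qTempFile(String taskAttemptId) throws IOException {
    String dirName = (taskAttemptId == null || taskAttemptId.isEmpty())
        ? "local-" + UUID.randomUUID()
        : taskAttemptId;
    Path dir = Paths.get(System.getProperty("java.io.tmpdir"), dirName);
    Files.createDirectories(dir);
    return dir.resolve("q-temp.seq");
  }
}
```

With this layout two tasks sharing a JVM, or two test runs sharing /tmp, would write to different q-temp.seq files as long as their directory names differ.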
{panel}
A little background: having to write a temporary file in this task is a corner
case that only arises when the Q block height is smaller than the number of
input rows of A coming in. That should never be the case with normal block
sizes, but it may happen with minSplitSize splits set at 1G or something, or
if the A input is extremely sparse (such as one non-zero element per row on
average). The latter, with Q blocks that are k+p wide (the number of
eigenvalues requested), is not a very good use case for this method; I'd
rather try to transpose first to see if it helps row-wise sparsity.
The test, however, intentionally sets the Q block height extremely small in
order to test blocking both within a split and among the splits.
{panel}
> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
> Key: MAHOUT-814
> URL: https://issues.apache.org/jira/browse/MAHOUT-814
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.5
> Reporter: Grant Ingersoll
> Assignee: Dmitriy Lyubimov
> Priority: Minor
> Fix For: 0.6
>
> Attachments: MAHOUT-814.patch
>
>
> I am running Mahout in an environment where Jenkins is also running and am getting:
> {quote}
> java.io.FileNotFoundException: /tmp/q-temp.seq (Permission denied)
> at java.io.FileOutputStream.open(Native Method)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
> at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:187)
> at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:183)
> at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:241)
> at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:528)
> at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
> at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
> at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:284)
> at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.getTempQw(QRFirstStep.java:263)
> at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.flushSolver(QRFirstStep.java:104)
> at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.map(QRFirstStep.java:175)
> at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.collect(QRFirstStep.java:279)
> at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:142)
> at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:71)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> {quote}
> Also seeing the following tests fail:
> {quote}
> Tests in error:
>
> testSSVDSolverSparse(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest):
> Q job unsuccessful.
>
> testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest):
> Q job unsuccessful.
>
> testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest):
> Q job unsuccessful.
>
> testSSVDSolverDense(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest):
> Q job unsuccessful.
> {quote}
> I haven't checked all of them, but I suspect they are all due to the same
> reason. We should dynamically create a temp area for each test using
> temporary directories under the main temp dir.