Things like that question make me more suspicious. 

We really need to get a handle on the Hadoop version question.

I have run spark-itemsimilarity on Hadoop 1.2.1 and on 2.6.0 (where it fails).
Andy ran it successfully on 2.2, and a user runs it on 2.4-MapR.

On 2.6.0 these lines seem to find the local file system:
  val conf = new Configuration()
  val fs = FileSystem.get(conf)
On the earlier versions of Hadoop they find the cluster or pseudo-cluster HDFS.
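
For anyone who wants to check this on their own cluster, here is a minimal
diagnostic sketch built around the two FileSystem lines. It prints the default
filesystem setting the Configuration picked up and the filesystem that
FileSystem.get actually returns. The object name FsCheck and the optional
core-site.xml argument are my own placeholders, not anything in Mahout. One
assumption worth testing: if core-site.xml is not on the classpath, Hadoop 2.x
falls back to its built-in default of file:///, which would look exactly like
"always returns the local one". I have not confirmed that this is the cause.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper, not part of Mahout: reports which filesystem Hadoop resolves.
object FsCheck {
  def describe(label: String, conf: Configuration): Unit = {
    // fs.defaultFS is the Hadoop 2.x key; on Hadoop 1.x the key is fs.default.name,
    // so one of these may print null depending on the version in use.
    println(s"[$label] fs.defaultFS    = ${conf.get("fs.defaultFS")}")
    println(s"[$label] fs.default.name = ${conf.get("fs.default.name")}")
    val fs = FileSystem.get(conf)
    println(s"[$label] FileSystem URI  = ${fs.getUri}")
    println(s"[$label] FileSystem impl = ${fs.getClass.getName}")
  }

  def main(args: Array[String]): Unit = {
    // Same two lines as in the failing code path: relies on whatever
    // core-site.xml happens to be on the classpath.
    describe("classpath", new Configuration())

    // Optional: pass the path to a core-site.xml as the first argument to see
    // whether loading it explicitly changes the answer.
    args.headOption.foreach { coreSite =>
      val explicit = new Configuration()
      explicit.addResource(new Path(coreSite))
      describe("explicit", explicit)
    }
  }
}
```

If the "classpath" run reports file:/// and the "explicit" run reports an
hdfs:// URI, that would point at the Hadoop config directory not being on the
classpath rather than at anything in the Mahout code.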

I’ve run Andy’s 20 newsgroups classifier test script on Hadoop 1.2.1 and got a 
classdef mismatch error, which probably means I built it wrong. I’ll be testing 
that again Monday.

I’m building a 2.2.0 pseudo cluster and will run 20 newsgroups and 
spark-itemsimilarity Monday.

I guess the big question is still 2.5 or 2.6. Does anyone know why the two 
FileSystem lines above would cause a problem in recent Hadoop versions? Does 
someone have a known-good 2.6 cluster that they can try a couple of tests on?


On Apr 5, 2015, at 9:52 AM, Andrew Musselman <[email protected]> wrote:

I wonder if that HDFS/FS issue is the same problem I have with
cluster-reuters.sh.

On Sunday, April 5, 2015, Pat Ferrel <[email protected]> wrote:

> Very few of these are on the “official” ticket list here:
> 
> https://issues.apache.org/jira/browse/MAHOUT-1648?filter=-4&jql=project%20%3D%20MAHOUT%20AND%20status%20in%20(Open%2C%20Reopened)%20AND%20priority%20in%20(Blocker%2C%20Critical)%20ORDER%20BY%20createdDate%20DESC
> 
> M-1674
> M-1665
> M-1648
> 
> The next time this is published it would be great to get versions of
> Hadoop people are using and what has actually been run on a cluster or
> pseudo cluster, under yarn etc. I’m increasingly suspicious that we don’t
> run uniformly on Hadoop 2.5-2.6 but have no hard evidence. I’ve failed on
> H2.6.0 but may not have an airtight configuration. If anyone has this
> config working I can supply a very simple test.
> 
> The failure happens when an HDFS path gets applied to the raw local
> filesystem, even though hadoop 2.6 HDFS is running and MAHOUT_LOCAL is
> unset. The root of the error I’ve seen is in getting the FileSystem, which
> always returns the local one.
> 
> 
> M-1674 is new and was found on Friday. Dmitriy already has a private fix
> but can’t commit it so I think we need a workaround.
> 
> On Apr 4, 2015, at 8:46 PM, Suneel Marthi <[email protected]> wrote:
> 
> Saturday (2 days before code freeze). The code freeze will be on Monday,
> April 6. Please address your assigned JIRAs on time.
> 
> Anand Avati
> -------------------------
> 
> M-1622: Multithreaded batch Item similarities output incorrect similarities
> M-1605: Make Visualizer test locale independent
> 
> Andrew Palumbo
> --------------------------
> M-1559: Add documentation for Wikipedia example
> M-1648: Update CMS for Mahout 0.10.0
> 
> Andrew Musselman
> -----------------------------
> M-1462: Cleaning up Random Forests documentation on Mahout website
> M-1470: LDA Topic dump
> M-1655: Refactor module dependencies
> M-1658: KMeans fails when run on Hadoop clusters
> 
> Frank Scholten
> ---------------------
> M-1625: lucene2seq: failure to convert a document that does not contain a
> field (the field is not required)
> M-1633: Failure to execute query when solr index contains documents with
> different fields
> M-1649: Lucene 5 upgrade
> 
> Pat Ferrel
> -----------------
> M-1507: Support input and output using user defined ID wherever possible
> M-1588: Multiple Input path support in Recommenders
> 
> Stevo Slavic
> --------------------
> M-1277: Lose dependency on custom commons-cli
> M-1278: Improve inheritance of apache parent pom
> M-1562: Publish Scaladocs
> M-1585: Javadocs are not hosted By Mahout Quality
> M-1650: upgrade 3rd party jars
> 
> Suneel Marthi
> ---------------------
> M-1469: Streaming KMeans fails when executed in MR mode and
> REDUCE_STREAMING_KMEANS set to true
> M-1512: Hadoop 2 compatibility
> M-1652: Java 7 update
> M-1630: Incorrect SparseMatrix.numColSlices() causes IllegalStateException
> 
> Ted Dunning
> -------------------
> 
> M-1672: TDigest update to 3.1 in OnlineSummarizers
> 
> Unassigned
> ------------------
> M-1551: Add document to describe how to use mlp with command line (Patch
> available)
> M-1637: RecommenderJob of ALS fails in the mapper because it uses the
> instance of other class
> 
> 
