Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

2014-03-04 Thread Margusja
Bytes Written=2483042
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.processOutput(PartialBuilder.java:113)
    at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:89)
    at org.apache.mahout.classifier.df.mapreduce.Builder.build(Builder.java:294)
    at org.apache.mahout.classifier.df.mapreduce.BuildForest.buildForest(BuildForest.java:228)
    at org.apache.mahout.classifier.df.mapreduce.BuildForest.run(BuildForest.java:188)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.mahout.classifier.df.mapreduce.BuildForest.main(BuildForest.java:252)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

I even downloaded the source from https://github.com/apache/mahout.git and
built it like this:

mvn -DskipTests -Dhadoop2.version=2.2.0 clean install

Then I ran:
/usr/lib/hadoop-yarn/bin/yarn jar 
mahout/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar 
org.apache.mahout.classifier.df.mapreduce.BuildForest -d 
input/data666.noheader.data -ds input/data666.noheader.data.info -sl 5 
-p -t 100 -o nsl-forest


and got the same error as above.

Is something wrong on my side, or can Hadoop 2.2.0 and Mahout no longer
work together?


The typical example:
/usr/lib/hadoop-yarn/bin/yarn jar 
/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0.2.0.6.0-101.jar pi 
2 5

works.
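
Editor's note on the error above: in Hadoop 1.x, org.apache.hadoop.mapreduce.JobContext is a concrete class, while in Hadoop 2.x it is an interface. Bytecode compiled against one line therefore tends to fail with IncompatibleClassChangeError when run against the other. The sketch below is one way to check which form the runtime classpath actually provides. The class name TypeKindCheck and its describe helper are hypothetical names made up for this illustration; only standard Java reflection is used.

```java
// Hypothetical helper (not part of Hadoop or Mahout): reports whether a named
// type resolves to a class or an interface on the current classpath.
final class TypeKindCheck {

    static String describe(String typeName) {
        try {
            // initialize=false: inspect the type without running static initializers.
            Class<?> type = Class.forName(typeName, false, TypeKindCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            // The type (or something it depends on) is not available on this classpath.
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Defaults are the two types named in the error messages of this thread.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.hadoop.mapreduce.JobContext",
            "org.apache.hadoop.mapreduce.TaskAttemptContext"
        };
        for (String name : names) {
            System.out.println(name + ": " + describe(name));
        }
    }
}
```

Running it with the cluster's jars on the classpath, for example java -cp "$(hadoop classpath):." TypeKindCheck, shows what the runtime provides. If it reports "interface" while the job jar was compiled against Hadoop 1.x, that mismatch would be consistent with the exception.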

--
Tervitades, Margus (Margusja) Roo
+372 51 48 780
http://margus.roo.ee
http://ee.linkedin.com/in/margusroo
skype: margusja
ldapsearch -x -h ldap.sk.ee -b c=EE (serialNumber=37303140314)
-BEGIN PUBLIC KEY-
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCvbeg7LwEC2SCpAEewwpC3ajxE
5ZsRMCB77L8bae9G7TslgLkoIzo9yOjPdx2NN6DllKbV65UjTay43uUDyql9g3tl
RhiJIcoAExkSTykWqAIPR88LfilLy1JlQ+0RD8OXiWOVVQfhOHpQ0R/jcAkM2lZa
BjM8j36yJvoBVsfOHQIDAQAB
-END PUBLIC KEY-



Re: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

2014-03-04 Thread Margusja

Hi, thanks for the reply.

Here is my output:

[hduser@vm38 ~]$ /usr/lib/hadoop/bin/hadoop version
Hadoop 2.2.0.2.0.6.0-101
Subversion g...@github.com:hortonworks/hadoop.git -r 
b07b2906c36defd389c8b5bd22bebc1bead8115b

Compiled by jenkins on 2014-01-09T05:18Z
Compiled with protoc 2.5.0
From source with checksum 704f1e463ebc4fb89353011407e965
This command was run using 
/usr/lib/hadoop/hadoop-common-2.2.0.2.0.6.0-101.jar


[hduser@vm38 ~]$ /usr/lib/hadoop/bin/hadoop jar 
mahout/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar 
org.apache.mahout.classifier.df.mapreduce.BuildForest -d 
input/data666.noheader.data -ds input/data666.noheader.data.info -sl 5 
-p -t 100 -o nsl-forest


...
14/03/04 16:22:51 INFO mapreduce.Job:  map 0% reduce 0%
14/03/04 16:23:12 INFO mapreduce.Job:  map 100% reduce 0%
14/03/04 16:23:43 INFO mapreduce.Job: Job job_1393936067845_0013 
completed successfully

14/03/04 16:23:44 INFO mapreduce.Job: Counters: 27
File System Counters
FILE: Number of bytes read=2994
FILE: Number of bytes written=80677
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=880103
HDFS: Number of bytes written=2436546
HDFS: Number of read operations=5
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=45253
Total time spent by all reduces in occupied slots (ms)=0
Map-Reduce Framework
Map input records=9994
Map output records=100
Input split bytes=123
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=456
CPU time spent (ms)=36010
Physical memory (bytes) snapshot=180752384
Virtual memory (bytes) snapshot=994275328
Total committed heap usage (bytes)=101187584
File Input Format Counters
Bytes Read=879980
File Output Format Counters
Bytes Written=2436546
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.processOutput(PartialBuilder.java:113)
    at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:89)
    at org.apache.mahout.classifier.df.mapreduce.Builder.build(Builder.java:294)
    at org.apache.mahout.classifier.df.mapreduce.BuildForest.buildForest(BuildForest.java:228)
    at org.apache.mahout.classifier.df.mapreduce.BuildForest.run(BuildForest.java:188)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.mahout.classifier.df.mapreduce.BuildForest.main(BuildForest.java:252)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)




On 04/03/14 16:11, Sergey Svinarchuk wrote:

Sorry, I didn't see that you were trying to use mahout-1.0-SNAPSHOT.
You used /usr/lib/hadoop-yarn/bin/yarn, but you need to use
/usr/lib/hadoop/bin/hadoop; then your example will succeed.


On Tue, Mar 4, 2014 at 3:45 PM, Sergey Svinarchuk 
ssvinarc...@hortonworks.com wrote:


Mahout 0.9 does not support Hadoop 2 dependencies.
You can use mahout-1.0-SNAPSHOT, or apply the patch from
https://issues.apache.org/jira/browse/MAHOUT-1329 to your Mahout source to
add Hadoop 2 support.


On Tue, Mar 4, 2014 at 3:38 PM, Margusja mar...@roo.ee wrote:


Hi

I ran the following command:
/usr/lib/hadoop-yarn/bin/yarn jar 
mahout-distribution-0.9/mahout-examples-0.9.jar
org.apache.mahout.classifier.df.mapreduce.BuildForest -d
input/data666.noheader.data -ds input/data666.noheader.data.info -sl 5
-p -t 100 -o nsl-forest

It worked when I used Hadoop 1.x.
Now that I use hadoop-2.2.0, it gives me:
14/03/04 15:25:58 INFO mapreduce.BuildForest

Re: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

2014-03-05 Thread Margusja
 mapreduce.Job: Job job_1393936067845_0018 
completed successfully

14/03/05 10:27:49 INFO mapreduce.Job: Counters: 27
File System Counters
FILE: Number of bytes read=2994
FILE: Number of bytes written=80677
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=880103
HDFS: Number of bytes written=2446794
HDFS: Number of read operations=5
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=36022
Total time spent by all reduces in occupied slots (ms)=0
Map-Reduce Framework
Map input records=9994
Map output records=100
Input split bytes=123
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=402
CPU time spent (ms)=32020
Physical memory (bytes) snapshot=200962048
Virtual memory (bytes) snapshot=997363712
Total committed heap usage (bytes)=111673344
File Input Format Counters
Bytes Read=879980
File Output Format Counters
Bytes Written=2446794
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.processOutput(PartialBuilder.java:113)
    at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:89)
    at org.apache.mahout.classifier.df.mapreduce.Builder.build(Builder.java:294)
    at org.apache.mahout.classifier.df.mapreduce.BuildForest.buildForest(BuildForest.java:228)
    at org.apache.mahout.classifier.df.mapreduce.BuildForest.run(BuildForest.java:188)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.mahout.classifier.df.mapreduce.BuildForest.main(BuildForest.java:252)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
[hduser@vm38 ~]$


On 05/03/14 01:18, Gokhan Capan wrote:

mvn clean package -DskipTests=true -Dhadoop2.version=2.2.0




Re: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

2014-03-08 Thread Margusja

Hi, is there any information about the problem I submitted?


On 05/03/14 10:30, Margusja wrote:

Hi

Here are my actions and the problematic result again:

[hduser@vm38 ~]$ git clone https://github.com/apache/mahout.git
remote: Reusing existing pack: 76099, done.
remote: Counting objects: 39, done.
remote: Compressing objects: 100% (32/32), done.
remote: Total 76138 (delta 2), reused 0 (delta 0)
Receiving objects: 100% (76138/76138), 49.04 MiB | 275 KiB/s, done.
Resolving deltas: 100% (34449/34449), done.
[hduser@vm38 ~]$ cd mahout
[hduser@vm38 ~]$ mvn clean package -DskipTests=true 
-Dhadoop2.version=2.2.0

...
...
...
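[INFO] Reactor Summary:
[INFO]
[INFO] Mahout Build Tools ................................ SUCCESS [15.529s]
[INFO] Apache Mahout ..................................... SUCCESS [1.657s]
[INFO] Mahout Math ....................................... SUCCESS [1:00.891s]
[INFO] Mahout Core ....................................... SUCCESS [2:44.617s]
[INFO] Mahout Integration ................................ SUCCESS [38.195s]
[INFO] Mahout Examples ................................... SUCCESS [45.458s]
[INFO] Mahout Release Package ............................ SUCCESS [0.012s]
[INFO] Mahout Math/Scala wrappers ........................ SUCCESS [53.519s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6:27.763s
[INFO] Finished at: Wed Mar 05 10:22:51 EET 2014
[INFO] Final Memory: 57M/442M
[INFO] ------------------------------------------------------------------------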
[hduser@vm38 mahout]$
[hduser@vm38 mahout]$ cd ../
[hduser@vm38 ~]$ /usr/lib/hadoop/bin/hadoop jar 
mahout/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar 
org.apache.mahout.classifier.df.mapreduce.BuildForest -d 
input/data666.noheader.data -ds input/data666.noheader.data.info -sl 5 
-p -t 100 -o nsl-forest
14/03/05 10:26:39 INFO mapreduce.BuildForest: Partial Mapred 
implementation

14/03/05 10:26:39 INFO mapreduce.BuildForest: Building the forest...
14/03/05 10:26:39 INFO client.RMProxy: Connecting to ResourceManager 
at /0.0.0.0:8032
14/03/05 10:26:51 INFO input.FileInputFormat: Total input paths to 
process : 1

14/03/05 10:26:51 INFO mapreduce.JobSubmitter: number of splits:1
14/03/05 10:26:51 INFO Configuration.deprecation: user.name is 
deprecated. Instead, use mapreduce.job.user.name
14/03/05 10:26:51 INFO Configuration.deprecation: mapred.jar is 
deprecated. Instead, use mapreduce.job.jar
14/03/05 10:26:51 INFO Configuration.deprecation: 
mapred.cache.files.filesizes is deprecated. Instead, use 
mapreduce.job.cache.files.filesizes
14/03/05 10:26:51 INFO Configuration.deprecation: mapred.cache.files 
is deprecated. Instead, use mapreduce.job.cache.files
14/03/05 10:26:51 INFO Configuration.deprecation: mapred.reduce.tasks 
is deprecated. Instead, use mapreduce.job.reduces
14/03/05 10:26:51 INFO Configuration.deprecation: 
mapred.output.value.class is deprecated. Instead, use 
mapreduce.job.output.value.class
14/03/05 10:26:51 INFO Configuration.deprecation: mapreduce.map.class 
is deprecated. Instead, use mapreduce.job.map.class
14/03/05 10:26:51 INFO Configuration.deprecation: mapred.job.name is 
deprecated. Instead, use mapreduce.job.name
14/03/05 10:26:51 INFO Configuration.deprecation: 
mapreduce.inputformat.class is deprecated. Instead, use 
mapreduce.job.inputformat.class
14/03/05 10:26:51 INFO Configuration.deprecation: mapred.input.dir is 
deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/03/05 10:26:51 INFO Configuration.deprecation: mapred.output.dir is 
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/03/05 10:26:51 INFO Configuration.deprecation: 
mapreduce.outputformat.class is deprecated. Instead, use 
mapreduce.job.outputformat.class
14/03/05 10:26:51 INFO Configuration.deprecation: mapred.map.tasks is 
deprecated. Instead, use mapreduce.job.maps
14/03/05 10:26:51 INFO Configuration.deprecation: 
mapred.cache.files.timestamps is deprecated. Instead, use 
mapreduce.job.cache.files.timestamps
14/03/05 10:26:51 INFO Configuration.deprecation: 
mapred.output.key.class is deprecated. Instead, use 
mapreduce.job.output.key.class
14/03/05 10:26:51 INFO Configuration.deprecation: mapred.working.dir 
is deprecated. Instead, use mapreduce.job.working.dir
14/03/05 10:26:52 INFO mapreduce.JobSubmitter: Submitting tokens for 
job: job_1393936067845_0018
14/03/05 10:26

java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected

2014-03-17 Thread Margusja
/mahout/examples/target/dependency/commons-digester-1.8.jar:/home/speech/mahout/examples/target/dependency/commons-el-1.0.jar:/home/speech/mahout/examples/target/dependency/commons-httpclient-3.0.1.jar:/home/speech/mahout/examples/target/dependency/commons-io-2.4.jar:/home/speech/mahout/examples/target/dependency/commons-lang-2.4.jar:/home/speech/mahout/examples/target/dependency/commons-lang3-3.1.jar:/home/speech/mahout/examples/target/dependency/commons-logging-1.1.3.jar:/home/speech/mahout/examples/target/dependency/commons-math-2.1.jar:/home/speech/mahout/examples/target/dependency/commons-math3-3.2.jar:/home/speech/mahout/examples/target/dependency/commons-net-1.4.1.jar:/home/speech/mahout/examples/target/dependency/easymock-3.2.jar:/home/speech/mahout/examples/target/dependency/guava-16.0.jar:/home/speech/mahout/examples/target/dependency/hadoop-core-1.2.1.jar:/home/speech/mahout/examples/target/dependency/hamcrest-core-1.3.jar:/home/speech/mahout/examples/target/dependency/icu4j-49.1.jar:/home/speech/mahout/examples/target/dependency/jackson-core-asl-1.9.12.jar:/home/speech/mahout/examples/target/dependency/jackson-jaxrs-1.7.1.jar:/home/speech/mahout/examples/target/dependency/jackson-mapper-asl-1.9.12.jar:/home/speech/mahout/examples/target/dependency/jackson-xc-1.7.1.jar:/home/speech/mahout/examples/target/dependency/jakarta-regexp-1.4.jar:/home/speech/mahout/examples/target/dependency/jaxb-api-2.2.2.jar:/home/speech/mahout/examples/target/dependency/jaxb-impl-2.2.3-1.jar:/home/speech/mahout/examples/target/dependency/jcl-over-slf4j-1.7.5.jar:/home/speech/mahout/examples/target/dependency/jersey-core-1.8.jar:/home/speech/mahout/examples/target/dependency/jersey-json-1.8.jar:/home/speech/mahout/examples/target/dependency/jersey-server-1.8.jar:/home/speech/mahout/examples/target/dependency/jettison-1.1.jar:/home/speech/mahout/examples/target/dependency/junit-4.11.jar:/home/speech/mahout/examples/target/dependency/log4j-1.2.17.jar:/home/speech/mahout/examples/
target/dependency/lucene-analyzers-common-4.6.1.jar:/home/speech/mahout/examples/target/dependency/lucene-benchmark-4.6.1.jar:/home/speech/mahout/examples/target/dependency/lucene-core-4.6.1.jar:/home/speech/mahout/examples/target/dependency/lucene-facet-4.6.1.jar:/home/speech/mahout/examples/target/dependency/lucene-highlighter-4.6.1.jar:/home/speech/mahout/examples/target/dependency/lucene-memory-4.6.1.jar:/home/speech/mahout/examples/target/dependency/lucene-queries-4.6.1.jar:/home/speech/mahout/examples/target/dependency/lucene-queryparser-4.6.1.jar:/home/speech/mahout/examples/target/dependency/lucene-sandbox-4.6.1.jar:/home/speech/mahout/examples/target/dependency/lucene-spatial-4.6.1.jar:/home/speech/mahout/examples/target/dependency/mahout-core-1.0-SNAPSHOT.jar:/home/speech/mahout/examples/target/dependency/mahout-core-1.0-SNAPSHOT-tests.jar:/home/speech/mahout/examples/target/dependency/mahout-integration-1.0-SNAPSHOT.jar:/home/speech/mahout/examples/target/dependency/mahout-math-1.0-SNAPSHOT.jar:/home/speech/mahout/examples/target/dependency/mahout-math-1.0-SNAPSHOT-tests.jar:/home/speech/mahout/examples/target/dependency/nekohtml-1.9.17.jar:/home/speech/mahout/examples/target/dependency/objenesis-1.3.jar:/home/speech/mahout/examples/target/dependency/randomizedtesting-runner-2.0.15.jar:/home/speech/mahout/examples/target/dependency/slf4j-api-1.7.5.jar:/home/speech/mahout/examples/target/dependency/slf4j-log4j12-1.7.5.jar:/home/speech/mahout/examples/target/dependency/solr-commons-csv-3.5.0.jar:/home/speech/mahout/examples/target/dependency/spatial4j-0.3.jar:/home/speech/mahout/examples/target/dependency/stax-api-1.0.1.jar:/home/speech/mahout/examples/target/dependency/stax-api-1.0-2.jar:/home/speech/mahout/examples/target/dependency/t-digest-2.0.2.jar:/home/speech/mahout/examples/target/dependency/xercesImpl-2.9.1.jar:/home/speech/mahout/examples/target/dependency/xmlpull-1.1.3.1.jar:/home/speech/mahout/examples/target/dependency/xpp3_min-1.1.4c.jar:/home
/speech/mahout/examples/target/dependency/xstream-1.4.4.jar

Mahout-1.0 from https://github.com/apache/mahout compiled as:

mvn -DskipTests clean install

[speech@h14 ~]$ hadoop version
Hadoop 2.2.0.2.0.6.0-101
Subversion g...@github.com:hortonworks/hadoop.git -r 
b07b2906c36defd389c8b5bd22bebc1bead8115b
Compiled by jenkins on 2014-01-09T05:18Z
Compiled with protoc 2.5.0
From source with checksum 704f1e463ebc4fb89353011407e965
This command was run using /usr/lib/hadoop/hadoop-common-2.2.0.2.0.6.0-101.jar
[speech@h14 ~]$

Any hint as to what I am doing wrong?
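
Editor's note, offered as a possible lead rather than a confirmed diagnosis: the classpath listing above contains hadoop-core-1.2.1.jar, and the build command shown omits -Dhadoop2.version, while the cluster reports Hadoop 2.2.0. A Hadoop 1.x jar on the classpath of a Hadoop 2.x job is one plausible source of the TaskAttemptContext error. The sketch below scans a colon-separated classpath string for such jars. The class name ClasspathHadoopScan, the method findHadoop1Jars and the file-name pattern are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hadoop or Mahout): lists classpath entries
// that look like Hadoop 1.x core jars, e.g. hadoop-core-1.2.1.jar.
final class ClasspathHadoopScan {

    // Assumed naming pattern for Hadoop 1.x core artifacts.
    private static final Pattern HADOOP1_CORE =
            Pattern.compile("hadoop-core-1\\.[0-9.]+\\.jar$");

    static List<String> findHadoop1Jars(String classpath) {
        List<String> hits = new ArrayList<>();
        // ':' is used on purpose: the listing in this thread is a Unix-style classpath.
        for (String entry : classpath.split(":")) {
            String trimmed = entry.trim();
            if (HADOOP1_CORE.matcher(trimmed).find()) {
                hits.add(trimmed);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Pass a classpath string as the first argument, or fall back to this JVM's own.
        String cp = args.length > 0 ? args[0] : System.getProperty("java.class.path");
        List<String> hits = findHadoop1Jars(cp);
        if (hits.isEmpty()) {
            System.out.println("No Hadoop 1.x core jar found.");
        } else {
            for (String hit : hits) {
                System.out.println("Hadoop 1.x jar on classpath: " + hit);
            }
        }
    }
}
```

If such a jar does turn up, rebuilding with the flag suggested earlier in this thread (mvn clean package -DskipTests=true -Dhadoop2.version=2.2.0) would be the first thing to try.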




Command line vector to sequence file

2014-03-18 Thread Margusja

Hi

I am looking for a simple way to convert vectors to a sequence file from
the command line.

For example, I have a data.txt file that contains vectors:
1,1
2,1
1,2
2,2
3,3
8,8
8,9
9,8
9,9

So is there a way to convert that into a sequence file from the command line?

I tried mahout seqdirectory, but afterwards hdfs dfs -text
output2/part-m-0 gives me something like:

/data.txt1,1
2,1
1,2
2,2
3,3
8,8
8,9
9,8
9,9

and that is not sequence file format, as I understand it.

I know there is a Java API, but I am looking for a command-line solution.
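
Editor's note: as far as I know, seqdirectory writes a sequence file whose key is the file path and whose value is the file's whole text, and hdfs dfs -text decodes sequence files before printing them. The output above may therefore already be a sequence file, just one holding raw text rather than vectors. Whatever tool is used, the first step is to parse each comma-separated line into a numeric vector. The sketch below shows only that step with plain Java. The class name CsvVectorParser and its methods are hypothetical. Writing the result out would additionally need Hadoop's SequenceFile.Writer and Mahout's VectorWritable, which are left out here because they require those libraries on the classpath.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Mahout): parses lines such as "8,9"
// into double[] vectors, skipping blank lines.
final class CsvVectorParser {

    static double[] parseLine(String line) {
        String[] fields = line.trim().split(",");
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i].trim());
        }
        return values;
    }

    static List<double[]> parseAll(List<String> lines) {
        List<double[]> vectors = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                vectors.add(parseLine(line));
            }
        }
        return vectors;
    }

    public static void main(String[] args) throws Exception {
        // Read the file named by the first argument, or fall back to a few
        // sample rows taken from the data.txt shown in this message.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList("1,1", "2,1", "8,9");
        for (double[] vector : parseAll(lines)) {
            System.out.println(Arrays.toString(vector));
        }
    }
}
```

Each parsed double[] could then be wrapped in a Mahout vector and appended to a sequence file, which is roughly what the Java program linked in the reply below is described as doing.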





Re: Command line vector to sequence file

2014-03-18 Thread Margusja

Thank you, I am going to try it.


On 18/03/14 10:58, Kevin Moulart wrote:

Hi,

I did the same search a few weeks back and found that there is nothing in
the current API to do that from the command line.

However, I did write a Java program that transforms a CSV file into a
SequenceFile, which can be used to train a naive Bayes classifier (amongst
other things).

Here are the sources:
https://gist.github.com/kmoulart/9616125

You'll find all you need to build a runnable jar with dependencies and a
proper command line (using JCommander).
Both the sequential version and the MapReduce one are in the given files.

If you're lazy, I'll put the whole maven project on my github later today.

Hope it helps you

Kévin Moulart

