[ 
https://issues.apache.org/jira/browse/HAMA-493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Jungblut updated HAMA-493:
---------------------------------

    Attachment: HAMA-493.patch


Here is the example usage: 

{noformat}
~$ /usr/local/hama/bin/hama jar \
     /usr/local/hama/hama-examples-0.4.0-incubating-SNAPSHOT.jar pagerank-text2seq \
     /tmp/test_seq/in.txt hdfs://localhost:9000/tmp/test_seq/out.seq
12/01/27 17:33:55 INFO util.TextToSequenceFile: Processing file : file:/tmp/test_seq/in.txt
12/01/27 17:33:55 INFO util.TextToSequenceFile: Written 246 to hdfs://localhost:9000/tmp/test_seq/out.seq/in.txt.seq
{noformat}

Then you can run pagerank on it:

{noformat}
~$ /usr/local/hama/bin/hama jar \
     /usr/local/hama/hama-examples-0.4.0-incubating-SNAPSHOT.jar pagerank \
     /tmp/test_seq/out.seq/ /tmp/test_seq/out/
{noformat}

It works similarly with SSSP.
In both cases, you can customize the separator string that delimits the records.
Play around with it a bit. It also lets people use regexes in their paths and 
can transform multiple text files into sequence files.
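
To illustrate the separator handling, here is a minimal sketch. It is not the 
code from the patch: it assumes each text line holds a vertex followed by its 
outlinks, split on a configurable separator, and the class and method names 
(AdjacencyLineParser, parse) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hama: splits one text record on a
// user-supplied separator string.
public class AdjacencyLineParser {
    private final String separator;

    public AdjacencyLineParser(String separator) {
        this.separator = separator;
    }

    /**
     * Splits a line into trimmed, non-empty tokens. By assumption the first
     * token is the vertex and the remaining tokens are its outlinks.
     */
    public List<String> parse(String line) {
        List<String> tokens = new ArrayList<>();
        // Pattern.quote makes the separator match literally, so "|" or "."
        // are not interpreted as regex metacharacters.
        for (String token : line.split(Pattern.quote(separator))) {
            String trimmed = token.trim();
            if (!trimmed.isEmpty()) {
                tokens.add(trimmed);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        AdjacencyLineParser parser = new AdjacencyLineParser("\t");
        System.out.println(parser.parse("a\tb\tc"));
    }
}
```

Quoting the separator is a deliberate choice in this sketch: a user who passes 
"|" most likely means the literal character, not a regex alternation.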

BTW, we should delete the partition directory inside the input directory once 
the job has run; otherwise the user gets "Not a file" errors when rerunning the 
job. Didn't we have a cleanup issue for that?
I added a step to the FileInputFormat that removes the partition dir. Please 
review this and say whether you are okay with this solution.
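
As a rough sketch of the cleanup idea (not the code in the patch), the 
following removes a directory and everything below it on the local file system 
using java.nio.file. The class name PartitionCleanup and the directory name 
"partitions" are assumptions for illustration. Inside Hama the same step would 
presumably go through Hadoop's FileSystem.delete(path, true) instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-file-system analogue of removing the partition dir.
public class PartitionCleanup {

    /**
     * Deletes the given directory and all of its contents.
     * Returns false if the directory does not exist, true otherwise.
     */
    public static boolean deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parents, so every
            // directory is already empty when it is deleted.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path input = Files.createTempDirectory("input");
        // "partitions" is an assumed name for the partition directory.
        Path partitions = Files.createDirectory(input.resolve("partitions"));
        Files.createFile(partitions.resolve("part-00000"));

        deleteRecursively(partitions);
        System.out.println("partition dir exists: " + Files.exists(partitions));
        Files.delete(input);
    }
}
```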

One more thing: once a task has thrown an exception, we should kill the whole 
job. Right now it just hangs forever, maybe because the task doesn't report 
back to the groom?

However, I should still add test cases for this patch and document the public 
methods.
                
> Provide text to seq-file utils for graph examples
> -------------------------------------------------
>
>                 Key: HAMA-493
>                 URL: https://issues.apache.org/jira/browse/HAMA-493
>             Project: Hama
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
>             Fix For: 0.4.0
>
>         Attachments: HAMA-493.patch
>
>



        
