[jira] [Commented] (HAMA-940) Add StreamInputFormat

2016-05-03 Thread Behroz Sikander (JIRA)

[ 
https://issues.apache.org/jira/browse/HAMA-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268267#comment-15268267
 ] 

Behroz Sikander commented on HAMA-940:
--

Your idea seems workable. I will study the InputFormat interface and update 
you soon. 

> Add StreamInputFormat
> -
>
> Key: HAMA-940
> URL: https://issues.apache.org/jira/browse/HAMA-940
> Project: Hama
> Issue Type: New Feature
> Components: bsp core
> Reporter: Edward J. Yoon
>
> Add StreamInputFormat that reads newly appended records from previous 
> superstep. 
> I roughly guess it will be possible using reopen() method and file offset.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAMA-940) Add StreamInputFormat

2016-05-01 Thread Edward J. Yoon (JIRA)

[ 
https://issues.apache.org/jira/browse/HAMA-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15266013#comment-15266013
 ] 

Edward J. Yoon commented on HAMA-940:
-

If we can hide these implementation details and provide simplified APIs for 
processing stream data, I think this approach is better.






[jira] [Commented] (HAMA-940) Add StreamInputFormat

2016-05-01 Thread Edward J. Yoon (JIRA)

[ 
https://issues.apache.org/jira/browse/HAMA-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15266012#comment-15266012
 ] 

Edward J. Yoon commented on HAMA-940:
-

As I mentioned in the description, we can simply check whether any records 
have been newly appended to the input file, keeping track of the last read 
offset. 

To implement this, you should first look at the InputFormat interface. The 
tricky issue is how to implement the getSplits() method and handle multiple 
tasks. 

At the moment, my simple idea is that one BSP task acts as a "stream input 
queue", without implementing StreamInputFormat or changing the framework core. 
For example, we set the file path in the job configuration, and the master 
task acts like below: 

{code}
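if (isMaster(peer.me)) {
  while (true) {
    peer.reopen();     // reopen the input so newly appended data is visible
    peer.skip(offset); // jump to the last read offset
    if (peer.readNext()) {
      // Load-balance here: hand the new record to a free slave task.
      sendTo("send a newly appended record to free slave tasks");
      // The offset should then be advanced past the record just read.
    } else {
      Thread.sleep(1000); // nothing new yet; the poll interval is arbitrary
    }
  }
}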
{code}
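
The "reopen, skip to the last offset, read what is new" step above could look 
roughly like the sketch below. It uses only plain java.io on a local file, not 
Hama's peer API or HDFS. The class name AppendedRecordReader is hypothetical, 
and the sketch assumes that records are newline-terminated UTF-8 lines; a 
trailing record with no newline yet is left for the next poll.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hama: returns only the records that were
// appended to a file since the previous call to poll().
class AppendedRecordReader {
  private final String path;
  private long offset = 0L; // byte position just past the last complete record

  AppendedRecordReader(String path) {
    this.path = path;
  }

  List<String> poll() throws IOException {
    List<String> records = new ArrayList<>();
    // Reopen the file on every poll so that appended bytes become visible.
    try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
      long length = file.length();
      if (length <= offset) {
        return records; // nothing new since the last poll
      }
      file.seek(offset); // jump to the last read offset
      // Assumes the newly appended region fits in memory (sketch only).
      byte[] buf = new byte[(int) (length - offset)];
      file.readFully(buf);

      // Find the last newline; bytes after it belong to an incomplete record.
      int end = buf.length - 1;
      while (end >= 0 && buf[end] != '\n') {
        end--;
      }
      if (end < 0) {
        return records; // no complete record yet
      }
      String chunk = new String(buf, 0, end, StandardCharsets.UTF_8);
      for (String record : chunk.split("\n", -1)) {
        records.add(record);
      }
      offset += end + 1; // remember where the next poll should start
    }
    return records;
  }
}
```

A master task could call poll() in its loop and forward each returned record 
to a free slave task, sleeping when the returned list is empty.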








[jira] [Commented] (HAMA-940) Add StreamInputFormat

2016-04-28 Thread Behroz Sikander (JIRA)

[ 
https://issues.apache.org/jira/browse/HAMA-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263201#comment-15263201
 ] 

Behroz Sikander commented on HAMA-940:
--

[~udanax] This seems to be an interesting feature. Can you add some more 
details on this?



