[
https://issues.apache.org/jira/browse/SAMOA-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15341399#comment-15341399
]
ASF GitHub Bot commented on SAMOA-49:
-------------------------------------
Github user bhupeshchawda commented on the issue:
https://github.com/apache/incubator-samoa/pull/55
@nicolas-kourtellis Please find my responses below:
1) The slow execution is a deliberate (although temporary) configuration
in samoa-apex that limits the number of tuples in an application window.
This has to do with the way iteration works in Apex, which is tightly coupled
to windowing. If we don't limit the number of tuples, the number of tuples in
a particular window keeps increasing because of the additional tuples that are
fed back on the iteration loop-back stream. If the number of iterations is
large enough, the time taken to process a window of data grows beyond normal
behaviour and the operator is killed by the Apex app master. I am working on
a workaround that either eliminates this limit or sets it optimally.
2) Execution in the local mode of Apex is highly asynchronous, with all
operators in the topology running in different threads. The local mode of
Samoa, on the other hand, seems to be synchronous; i.e. the next tuple is
processed only when the previous one has been processed completely by all
operators. I also tried the local mode of Storm, which likewise produces
different results every time it is run on the same input file.
3) I think this is due to the same reason as in (2).
4) Yes, these changes are necessary for Apex to function correctly. Apex
relies on Kryo serialization (without any fallback to Java serialization), and
hence the classes need to have a default constructor. I think it would
be better to have them as part of this PR. Maybe I can split them into a
separate commit if that helps?
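To illustrate point (4), here is a minimal sketch of why a default constructor
matters. It does not use Kryo itself; it uses plain JDK reflection as a
stand-in for a serializer that creates objects through a no-arg constructor,
which is an assumption about how the default instantiation behaves. The class
names (EventWithoutDefault, EventWithDefault) and the helper method are
hypothetical and are not taken from the PR.

```java
import java.lang.reflect.Constructor;

public class DefaultConstructorSketch {

    // Hypothetical class with only a parameterized constructor.
    static class EventWithoutDefault {
        private final int id;

        EventWithoutDefault(int id) {
            this.id = id;
        }
    }

    // Hypothetical class with an extra no-arg constructor added for the serializer.
    static class EventWithDefault {
        private int id;

        // Intended only for reflective instantiation during deserialization.
        private EventWithDefault() {
        }

        EventWithDefault(int id) {
            this.id = id;
        }

        int getId() {
            return id;
        }
    }

    // Returns true if an instance can be created through a no-arg constructor,
    // mimicking (as an assumption) what a reflection-based serializer would try.
    static boolean canInstantiateReflectively(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true); // the constructor may be private
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // NoSuchMethodException lands here when no no-arg constructor exists.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiateReflectively(EventWithoutDefault.class)); // false
        System.out.println(canInstantiateReflectively(EventWithDefault.class));    // true
    }
}
```

The no-arg constructor can stay private in this sketch, so adding one need not
widen the public API of the class.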
Thanks!
> Add an Adapter for Apache Apex
> ------------------------------
>
> Key: SAMOA-49
> URL: https://issues.apache.org/jira/browse/SAMOA-49
> Project: SAMOA
> Issue Type: New Feature
> Reporter: Bhupesh Chawda
>
> Apache Apex is a new data-in-motion platform that unifies stream processing
> as well as batch processing. An Apache Apex adapter for Samoa would allow
> users to run streaming machine learning algorithms built on Apache Samoa on
> the Apache Apex platform. This adapter should be able to translate the Apache
> Samoa topologies into Apache Apex DAGs in order to run them on the Apex
> platform.