Re: link in the Apex documentation is not working

2018-11-07 Thread Thomas Weise
The correct link is: https://github.com/apache/apex-malhar/blob/master/examples/sql/src/main/java/org/apache/apex/malhar/sql/sample/PureStyleSQLApplication.java Would you like to submit a pull request to fix the problem? Here is the source: https://github.com/apache/apex-malhar/blob/master/docs/ap

Re: Catastrophic Error: Execution halted due to Kryo exception! Encountered unregistered class ID 12

2018-08-22 Thread Thomas Weise
Do you see any other errors before you get this exception? If not, try to use a different stream codec: https://github.com/apache/apex-malhar/blob/master/library/src/main/java/org/apache/apex/malhar/lib/codec/KryoSerializableStreamCodec.java On Tue, Aug 21, 2018 at 2:12 PM Oliver Winke wrote:

Re: Before I start re-inventing the wheel: AbstractFileInputOperator

2018-06-21 Thread Thomas Weise
Did you already look at FileSplitter/BlockReader? https://apex.apache.org/docs/malhar/operators/file_splitter/ Would that better support your customization requirements? -- sent from mobile On Thu, Jun 21, 2018, 9:29 PM Aaron Bossert wrote: > folks, > > I have been working with > com.datat

Re: Operator checkpointing not working

2018-06-18 Thread Thomas Weise
The default checkpoint interval is 30s and the interval between failing aggregators is approximately 10s? In that case, no state will ever get checkpointed and operator reset to initial state. Thomas -- sent from mobile On Mon, Jun 18, 2018, 12:45 PM Mateusz Zakarczemny wrote: > Hi Pramod, > I

Re: TimeBasedDedupOperator failing with NullPointer

2018-05-17 Thread Thomas Weise
Hi Vivek, Did you end up fixing the issue? If so, would you like to contribute the fix back to Apex? Thanks, Thomas On Fri, Mar 23, 2018 at 11:05 AM, Vivek Bhide wrote: > Hi Vlad, > > I have filed a JIRA https://issues.apache.org/jira/browse/APEXMALHAR-2557 > > but I am not sure about all the

Re: Kryo version and default serializer

2018-05-16 Thread Thomas Weise
Hi, It is not necessary to use a different stream codec when you have control over the class that is causing the error. You can annotate a different field serializer instead, using the @FieldSerializer.Bind annotation. Here is an example: https://github.com/apache/apex-malhar/blob/2fe2903bfe65055

REST server for Apex application management

2018-05-08 Thread Thomas Weise
Hi, I wanted to make you aware that there is a base implementation for a REST service to manage Apex applications here: https://github.com/atrato/atrato-server It is ASL and everyone is welcome to use it or contribute to it. If there is sufficient interest it can also be discussed to add it to ap

Re: High CPU % even after increasing VCORES

2018-05-01 Thread Thomas Weise
To utilize multiple VCORES, multiple threads would be required. Why not increase the parallelism (partition count)? Thanks, Thomas On Tue, May 1, 2018 at 8:29 PM, Vlad Rozov wrote: > I am not sure I understand your assumptions regarding VCORES and CPU usage > and/or number of tuples emitted pe

Re: Recommended way to query hive and capture output

2017-10-23 Thread Thomas Weise
Have you explored using the JDBC interface to read from Hive? Apex has JDBC connectors and it should be possible to use them with Hive. Thanks, Thomas On Mon, Oct 23, 2017 at 11:52 AM, Vivek Bhide wrote: > Hi > > I would like to know if there is any recommended way to execute queries on > hiv

Re: how to find application master URL ( non proxy )

2017-09-26 Thread Thomas Weise
ne of > the desirable feature. The limitation on prod environment is that we don't > access to all the workers nodes where potentially the AppMaster could be > running and GUI runs from edge node. > > Sunil Parmar > > On Tue, Sep 26, 2017 at 1:23 PM, Thomas Weise wrote:

Re: how to find application master URL ( non proxy )

2017-09-26 Thread Thomas Weise
Once the app is running you get the AM REST host/port from the app info, such as: apex (application_1506456479682_0001) > get-app-info ... "appMasterTrackingUrl": "apex-sandbox:35463", ... Have a look at the Apex CLI or here for details on how the RE

Re: Malhar input operator for connecting to Teradata

2017-08-28 Thread Thomas Weise
Welcome. The referenced PR: https://github.com/apache/apex-malhar/pull/652 On Mon, Aug 28, 2017 at 5:46 PM, Vivek Bhide wrote: > Thanks Thomas for workaround and heads up about the bugs :) > > Let me see if I can get this working in my usecase > > Regards > Vivek > > > > -- > View this message

Re: Malhar input operator for connecting to Teradata

2017-08-28 Thread Thomas Weise
The JDBC poll operator uses a query translator that does not support all dialects in the OS version. For a one-off with Oracle, the following override for post processing did the trick: @Override protected String buildRangeQuery(Object key, int offset, int limit) { String que

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Thomas Weise
There are couple bugs that were recently identified that look related to this: https://issues.apache.org/jira/browse/APEXMALHAR-2526 https://issues.apache.org/jira/browse/APEXCORE-767 Perhaps the fix for first item is what you need? Thomas On Thu, Aug 10, 2017 at 8:41 PM, Pramod Immaneni wrot

Re: Apache Apex with Apache slider

2017-08-04 Thread Thomas Weise
You can assemble it without writing code, since Slider is a script based environment. Thomas On Fri, Aug 4, 2017 at 4:51 AM, Vlad Rozov wrote: > What will be a reason to use Apex cli over Launcher API? > > Thank you, > > Vlad > > On 8/1/17 11:59, Thomas Weise wrote:

Re: How to consume from two different topics in one apex application

2017-08-03 Thread Thomas Weise
I don't think you can use both Kafka client 0.8.x and 0.9.x within the same application. The dependencies overlap and will conflict. You can use 0.8 client to talk to 0.9 server but since you want to use SSL that's not possible (security was only added in 0.9). I have not tried that, but you might

Re: Apache Apex with Apache slider

2017-08-01 Thread Thomas Weise
Slider let's you package the binaries (along with the .apa), so that would eliminate the need for any extra install. On Tue, Aug 1, 2017 at 11:54 AM, Pramod Immaneni wrote: > You might be able to launch the application by running the apex cli as a > separate process from your slider code and pa

Re: Apache Apex with Apache slider

2017-08-01 Thread Thomas Weise
Sounds like what you want is a Slider application that monitors the Apex application? Yes, that would be possible. The Slider app/script could use the RM to locate the app and check the YARN status and then through the Apex AM REST API to poll the Apex metrics. Thomas On Tue, Aug 1, 2017 at 11:

Kafka producer does not propagate error when unable to connect to broker

2017-07-16 Thread Thomas Weise
I noticed that when the Kafka output operator (0.9) is configured with a wrong port or missing the security setup required by the cluster, it will be in "ACTIVE" state but not do anything, essentially ignoring the error. I see that part of the problem is that the producer is async. That makes me w

Re: Apex HTTP Authentication

2017-07-12 Thread Thomas Weise
I will open a PR. On Wed, Jul 12, 2017 at 10:07 PM, Pramod Immaneni wrote: > Looks like the disable behavior is a bug, could you file a JIRA? > > Thanks > > On Wed, Jul 12, 2017 at 9:36 PM, Thomas Weise wrote: > >> I'm working on a secure cluster that has a

Apex HTTP Authentication

2017-07-12 Thread Thomas Weise
I'm working on a secure cluster that has authentication enabled for the YARN services. In my Apex setup, I have: apex.attr.STRAM_HTTP_AUTHENTICATION DISABLE "DISABLE - Disable authentication for web services." That's not what happens though, it rather follows the Hadoop setting and

Re: Kafka operators with Kerberos authentication

2017-07-08 Thread Thomas Weise
naging the kafka deployment through the distro? > > Thanks > > On Thu, Jul 6, 2017 at 6:05 PM, Thomas Weise wrote: > >> Hi, >> >> Has anyone run the Apex Kafka consumer or producer with security enabled? >> >> I got authentication working in embedded

Kafka operators with Kerberos authentication

2017-07-06 Thread Thomas Weise
Hi, Has anyone run the Apex Kafka consumer or producer with security enabled? I got authentication working in embedded mode and looking to deploy to the cluster. It will require * keytab * JAAS config with the KafkaClient settings. * JVM option -Djava.security.auth.login.config=./kafka_client_j

Re: How to consume from Kafka-V10 (SSL enabled) topic using Apex

2017-07-03 Thread Thomas Weise
You can use the operators that work with 0.9 Kafka client to access 0.10 Kafka cluster (Malhar release 3.7 has only operators that work with 0.9 client, next release will have 0.10 client support). Thomas On Mon, Jul 3, 2017 at 4:52 AM, Chaitanya Chebolu wrote: > Hi Rishi, > > You can use Kaf

Re: JDBC poll operator performance

2017-06-27 Thread Thomas Weise
the >> implementation and file format of the given db. >> >> ~ Bhupesh >> >> >> ___ >> >> Bhupesh Chawda >> >> E: bhup...@datatorrent.com | Twitter: @bhupeshsc >> >> www.datatorrent.com | apex.

JDBC poll operator performance

2017-06-26 Thread Thomas Weise
Hi, It seems the poll operator performs unnecessary operations in the case where the "key" column values in the source table are monotonic increasing. There should be no need to sort or do count selects. Instead it should be sufficient to just filter with the key range. Let's say the key column i

Re: What is recommended way to achieve exactly once tuple processing in case of operator failure scenario

2017-06-17 Thread Thomas Weise
There is no way to avoid reprocessing tuples if the goal is to achieve exactly-once results. Please have a look at http://apex.apache.org/docs.html - presentation about fault tolerance and blog "End-to-end exactly-once". Platform alone cannot guarantee exactly-once results. Operators mutating stat

Re: Windowing

2017-06-09 Thread Thomas Weise
Here is an example that uses the window operator to compute top words and time series from Tweets (and optionally outputs the results to websocket): https://github.com/tweise/apex-samples/tree/master/twitter On Thu, Jun 8, 2017 at 11:11 AM, Tushar Gosavi wrote: > You can find more apex exampl

Re: trying to run Apex on Hadoop cluster

2017-06-06 Thread Thomas Weise
Claire, There shouldn't be a need to run the pipeline like this since the Apex runner already has the support to launch hadoop with the required dependencies. Can you please confirm that you are able to run the basic word count example as shown here: https://beam.apache.org/documentation/runners

Re: To run Apex runner of Apache Beam

2017-06-02 Thread Thomas Weise
It may also be helpful to enable logging to see if there are any exceptions in the execution. The console log only contains output from Maven. Is logging turned off (or a suitable slf4j backend missing)? On Fri, Jun 2, 2017 at 5:00 PM, Thomas Weise wrote: > Hi Claire, > > The sta

Re: To run Apex runner of Apache Beam

2017-06-02 Thread Thomas Weise
Hi Claire, The stack traces suggest that the application was launched and the Apex DAG is deployed and might be running (it runs in embedded mode). Do you see any output? In case it writes to files, anything in output or output staging directories? Thanks, Thomas On Thu, Jun 1, 2017 at 3:18 PM

Re: Apex + DT RTS

2017-05-10 Thread Thomas Weise
You can find binary download options here: http://apex.apache.org/downloads.html#third-party-downloads If you are looking for just the Apex CLI binaries: https://github.com/atrato/apex-cli-package/releases/ Thomas On Wed, May 10, 2017 at 3:57 PM, apex user wrote: > So, does it mean that I c

Re: One Year Anniversary of Apache Apex

2017-04-26 Thread Thomas Weise
>> > >> > >> > ___ >> > >> > Bhupesh Chawda >> > >> > E: bhup...@datatorrent.com | Twitter: @bhupeshsc >> > >> > www.datatorrent.com | apex.apache.org >> &

One Year Anniversary of Apache Apex

2017-04-25 Thread Thomas Weise
Hi, It's been one year for Apache Apex as top level project, congratulations to the community! I wrote this blog to reflect and look ahead: http://www.atrato.io/blog/2017/04/25/one-year-apex/ Your comments and suggestions are welcome. Thanks, Thomas

Re: set attribute of type Map in properties.xml

2017-04-14 Thread Thomas Weise
Ram, Is there are reason why this is not reflected here: http://apex.apache.org/docs/apex/application_packages/ Thanks, Thomas On Wed, Apr 12, 2017 at 5:35 PM, Munagala Ramanath wrote: > http://docs.datatorrent.com/application_packages/ > > Please take a look at the "Operator properties" s

[ANNOUNCE] Apache Apex Malhar 3.7.0 released

2017-04-05 Thread Thomas Weise
Dear Community, The Apache Apex community is pleased to announce release 3.7.0 of the Apex Malhar library. Apache Apex is an enterprise grade big data-in-motion platform that unifies stream and batch processing. Apex was built for scalability and low-latency processing, high availability and oper

Re: Need suggestion about temporary file usage on container node

2017-03-26 Thread Thomas Weise
lized for Operator:* > ls: cannot access '/tmp/hadoop-vikram/nm-local-d > ir/usercache/vikram/appcache/application_1490062699498_0013/ > container_1490062699498_0013_01_03/py4j-0.10.4.tar.gz': No such file > or directory > > Thanks & Regards, > Vikram > > On Sun, Mar

Re: 3.5.0 apex core build failing with hadoop 2.7.3 dependency

2017-03-23 Thread Thomas Weise
Bummer. Thanks Ram, I was just going to look at it. The binaries work on 2.7.x though. Will the issue only show in secure mode? Please create a JIRA in APEXCORE. Chinmay, you may need to apply a patch as part of your build to be able to use the 3.5.0 sources. Thomas On Thu, Mar 23, 2017 at 7:1

Re: Apex installation instructions

2017-03-22 Thread Thomas Weise
Hi, Those instructions are indeed for running in a dev environment only. There are a few other download options: http://apex.apache.org/downloads.html The Bigtop binaries can be used, but they are designed to also install Hadoop. I believe you are looking to just install the Apex CLI on an edge

Re: Need suggestion about temporary file usage on container node

2017-03-18 Thread Thomas Weise
uld you suggest a way so that I can figure out the path > to particular localized file on container node preferably using > OperatorContext ? > > > On Fri, Mar 17, 2017 at 8:06 PM, Thomas Weise wrote: > >> If you use LIB_JARS or FILES, then those files will be localized by YA

Re: JavaDocs for Apex Core now available

2017-03-17 Thread Thomas Weise
Yay! Will update the web site soon. -- sent from mobile https://ci.apache.org/projects/apex-core/apex-core- javadoc-release-3.4/index.html https://ci.apache.org/projects/apex-core/apex-core- javadoc-release-3.5/index.html ... at the above links. Ram -- _

Re: Need suggestion about temporary file usage on container node

2017-03-17 Thread Thomas Weise
If you use LIB_JARS or FILES, then those files will be localized by YARN on the container node, you don't need to manually copy them from HDFS or write cleanup code for it. Thomas On Fri, Mar 17, 2017 at 12:49 AM, vikram patil wrote: > Hello All, > > I am working on operator which would downloa

Re: One-time Initialization of in-memory data using a data file

2017-01-23 Thread Thomas Weise
First link without frame: https://ci.apache.org/projects/apex-malhar/apex-malhar-javadoc-release-3.6/org/apache/apex/malhar/lib/state/managed/AbstractManagedStateImpl.html On Mon, Jan 23, 2017 at 3:33 PM, Thomas Weise wrote: > Roger, > > An Apex operator typically holds state that it

Re: One-time Initialization of in-memory data using a data file

2017-01-23 Thread Thomas Weise
Roger, An Apex operator typically holds state that it uses for processing and often that state is mutable. For large state: "Managed state" in Malhar (and its predecessor HDHT) were designed for large state that can be mutated efficiently under a specific write pattern (semi ordered keys). However

[ANNOUNCE] Apache Apex Core 3.5.0 released

2016-12-19 Thread Thomas Weise
Dear Community, The Apache Apex community is pleased to announce release 3.5.0 of Apex Core (the engine). Apache Apex is an enterprise grade big data-in-motion platform that unifies stream and batch processing. Apex was built for scalability and low-latency processing, high availability and opera

[ANNOUNCE] Apache Apex Malhar 3.6.0 released

2016-12-08 Thread Thomas Weise
Dear Community, The Apache Apex community is pleased to announce release 3.6.0 of the Malhar library. The release resolved 70 JIRAs . The release adds first iteration of SQL support via Apache Calcite. Features include SELECT, INSERT, INNER JOIN with non-empty equi join

Re: [EXTERNAL] Re: KafkaSinglePortInputOperator

2016-12-08 Thread Thomas Weise
If this was fixed, why is the JIRA still open? On Wed, Dec 7, 2016 at 10:58 PM, Chaitanya Chebolu < chaita...@datatorrent.com> wrote: > There was a bug in Malhar-3.4.0 and is fixed in Malhar-3.6.0. > JIRA details of this issue is here > . > >

Re: Connection refused exception

2016-11-29 Thread Thomas Weise
There was a bug in the 3.5 release, this could be it. We fixed it and put it into 3.5.1-snapshot. The detail are in JIRA, I don't have access to it right now. -- sent from mobile On Nov 29, 2016 6:02 AM, "Feldkamp, Brandon (CONT)" < brandon.feldk...@capitalone.com> wrote: > This pops up a little

Re: ClassNotFoundException for com.ning.http.client.AsyncHttpClientConfig

2016-11-25 Thread Thomas Weise
Vikram, Can you list all the jar files in your application package. Is it possible that you are using an old version of Malhar library with Apex core 3.4 or later? That class has been removed in 3.4. You can only use version Malhar 3.4 and later with core 3.4. Thanks, Thomas On Thu, Nov 24, 2016

Re: JAVA API Documentation : 404

2016-11-22 Thread Thomas Weise
That's great. We can then also update the web site to point there. -- sent from mobile On Nov 22, 2016 5:13 PM, "Munagala Ramanath" wrote: > https://ci.apache.org/projects/apex-malhar/apex- > malhar-javadoc-release-3.5/index.html > > Please use that link for Apex Malhar release 3.5. I'll post a

Re: Example using Streaming SQL ?

2016-11-18 Thread Thomas Weise
Hi, This is a new feature and the first cut will go into release 3.6. There are examples here: https://github.com/apache/apex-malhar/tree/master/demos/sql Documentation isn't in place yet. Chinmay, do you have some more information to share? Thanks, Thomas On Fri, Nov 18, 2016 at 5:58 PM, Sha

Re: [EXTERNAL] kafka

2016-11-01 Thread Thomas Weise
IMO this should be added here: http://apex.apache.org/docs/malhar/operators/kafkaInputOperator/ On Tue, Nov 1, 2016 at 5:24 PM, hsy...@gmail.com wrote: > Hey Raja, > > The setup for secure kafka input operator is not easy. You can follow > these steps. > 1. Follow kafka document to setup your

Re: KryoException to write Hbase

2016-10-20 Thread Thomas Weise
Please make sure that your Gson parser is a transient member of the operator. On Thu, Oct 20, 2016 at 2:33 PM, Bandaru, Srinivas < srinivas.band...@optum.com> wrote: > Hi, > > We are building a an apex (Datatorrent) application to write into Hbase. > Getting the below error while launching the ap

Re: Datatorrent operator for Hbase

2016-10-20 Thread Thomas Weise
This may also help: http://docs.datatorrent.com/troubleshooting/#hadoop-dependencies-conflicts On Thu, Oct 20, 2016 at 11:39 AM, Thomas Weise wrote: > Please see the HBase dependency and its exclusions here: > > https://github.com/apache/apex-malhar/blob/master/contrib/pom.xml#L342 &

Re: Datatorrent operator for Hbase

2016-10-20 Thread Thomas Weise
gt;> >> >> On Thu, Oct 20, 2016 at 11:59 AM, Jaspal Singh >> wrote: >> > Hi Thomas, Thanks for sharing this example code. >> > Still I couldn't see where the hbase tablename is configured, it says >> in >> > description that it can be configure

Re: Datatorrent operator for Hbase

2016-10-19 Thread Thomas Weise
Here is an example that uses HBase that may be helpful: https://github.com/apache/apex-malhar/blob/master/demos/twitter/src/main/java/com/datatorrent/demos/twitter/TwitterDumpHBaseApplication.java Thomas On Wed, Oct 19, 2016 at 6:36 PM, Jaspal Singh wrote: > Where I need to set the table name.

Re: balanced of Stream Codec

2016-10-15 Thread Thomas Weise
Without knowing the operations following the indeterministic partitioning, assume that you cannot have exactly-once results because processing won't be idempotent. If there are only stateless operations, then it should be OK. If there are stateful operations (windowing with any form of aggregation

Re: DT Fault Tolerance UHG

2016-10-13 Thread Thomas Weise
If the tuples processed by the output operator are not of type String, then the recovery code may fail because it attempts to interpret the messages that were already stored as String. That's a bug in the operator. The workaround is to convert the object to String in the upstream operator and then

Re: KafkaSinglePortExactlyOnceOutputOperator

2016-10-12 Thread Thomas Weise
Yes, it may be better to pull out the topic selection into an upstream operator that emits the messages on separate ports per topic and then you can use the exactly-once output operator without customization. On Wed, Oct 12, 2016 at 11:29 AM, Bandaru, Srinivas < srinivas.band...@optum.com> wrote:

Re: KafkaSinglePortExactlyOnceOutputOperator

2016-10-12 Thread Thomas Weise
The operator recovery logic assumes that data is written to a single topic. This may happen because it is writing multiple? On Wed, Oct 12, 2016 at 11:25 AM, Bandaru, Srinivas < srinivas.band...@optum.com> wrote: > Hi, Need some help with the errors we are having with” KafkaSinglePort > *Exactl

Re: DataTorrent application Parllel processing

2016-10-11 Thread Thomas Weise
There is some draft documentation for 0.9 here: https://github.com/chaithu14/incubator-apex-malhar/blob/APEXMALHAR-2242-kafka0.9doc/docs/operators/kafkaInputOperator.md#09-version-of-kafka-input-operator On Tue, Oct 11, 2016 at 3:21 PM, Thomas Weise wrote: > Hi, > > There i

Re: DataTorrent application Parllel processing

2016-10-11 Thread Thomas Weise
Hi, There is general information on partitioning here: http://apex.apache.org/docs/apex/application_development/#partitioning and a more recent presentation here: http://www.slideshare.net/ApacheApex/smart-partitioning-with-apache-apex-webinar For MapR streams you are using the 0.9 operator, a

Re: Datatorrent fault tolerance

2016-10-07 Thread Thomas Weise
gt; javax.validation.ConstraintViolationException: Operator kafkaOut violates >>>>> constraints >>>>> [ConstraintViolationImpl{rootBean=com.example.datatorrent.KafkaSinglePortExactlyOnceOutputOperator@4726f93f, >>>>> propertyPath='topic', me

Re: Datatorrent fault tolerance

2016-10-07 Thread Thomas Weise
> >>>> After building the application, it throws error during launch: >>>> >>>> An error occurred trying to launch the application. Server message: >>>> java.lang.NoClassDefFoundError: Lkafka/javaapi/producer/Producer; at >>>> java.lang.Class.

Re: Datatorrent fault tolerance

2016-10-07 Thread Thomas Weise
Oct 7, 2016 at 10:42 AM, Jaspal Singh > wrote: > >> Thomas, >> >> I was trying to refer to the input from previous operator. >> >> Another thing when we extend the AbstractKafkaOutputOperator, do we need >> to specify ? Since we are getting an object

Re: Datatorrent fault tolerance

2016-10-07 Thread Thomas Weise
: > Hi Thomas, > > I have a question, so when we are using > *KafkaSinglePortExactlyOnceOutputOperator* to write results into > maprstream topic will it be able to read messgaes from the previous > operator ? > > > Thanks > Jaspal > > On Thu, Oct 6, 2016 at 6:

Re: Datatorrent fault tolerance

2016-10-06 Thread Thomas Weise
you mentioned about checkpointing the offset ranges > to replay in same order from kafka. > > Is there any property we need to configure to do that? like initialOffset > set to APPLICATION_OR_LATEST. > > > Thanks > Jaspal > > > On Thursday, October 6, 2016, Thomas We

Re: Datatorrent fault tolerance

2016-10-06 Thread Thomas Weise
any previous operators fail ? How we can make sure they also > recover using EXACTLY_ONCE processing mode ? > > > Thanks > Jaspal > > > On Thursday, October 6, 2016, Thomas Weise wrote: > >> In that case please have a look at: >> >> https://github.com/apa

Re: Datatorrent fault tolerance

2016-10-06 Thread Thomas Weise
:37 PM, Jaspal Singh wrote: > Hi Thomas, > > In our case we are writing the results back to maprstreams topic based on > some validations. > > > Thanks > Jaspal > > > On Thursday, October 6, 2016, Thomas Weise wrote: > >> Hi, >> >> which

Re: Datatorrent fault tolerance

2016-10-06 Thread Thomas Weise
is same as the Datatorrent documentation but it’s not > working. > > > > Thanks, > > Srinivas > > > > > > *From:* Thomas Weise [mailto:t...@apache.org] > *Sent:* Thursday, October 06, 2016 3:03 PM > *To:* users@apex.apache.org > *Subject:* Re: Data

Re: Datatorrent fault tolerance

2016-10-06 Thread Thomas Weise
Hi, It would be necessary to know a bit more about your application for specific recommendations, but from what I see above, a few things don't look right. It appears that you are setting the processing mode on the input operator, which only reads from Kafka. Exactly-once is important for operato

Re: Apex and Malhar Java 8 Certified

2016-10-05 Thread Thomas Weise
>>> *Organization: *DataTorrent >>> *Reply-To: *"users@apex.apache.org" >>> *Date: *Monday, October 3, 2016 at 11:43 PM >>> *To: *"users@apex.apache.org" >>> *Subject: *Re: Apex and Malhar Java 8 Certified >>> >>&

Re: Apex and Malhar Java 8 Certified

2016-10-05 Thread Thomas Weise
Java 8 Certified > > > > We do test on Java 8 - both Apex Core and Malhar Apache Jenkins builds use > JDK 1.8 to run tests. > > Thank you, > > Vlad > > On 10/3/16 15:45, Thomas Weise wrote: > > Apex is built against Java 7 and expected to work as is on Java

Re: Apex and Malhar Java 8 Certified

2016-10-03 Thread Thomas Weise
Apex is built against Java 7 and expected to work as is on Java 8 (Hadoop distros still support 1.7 as well). Are you running into specific issues? Thanks, Thomas On Mon, Oct 3, 2016 at 12:06 PM, McCullough, Alex < alex.mccullo...@capitalone.com> wrote: > Hey All, > > > > I know there were talks

Re: datatorrent gateway application package problem

2016-09-29 Thread Thomas Weise
Glad you have it working now. And yes, the gateway (like the apex CLI) only depends on the hadoop script. Different distributions of Hadoop have different ways of managing their bits and we want to be agnostic. So if you would like to use a different conf dir, then you would make that change inside

Re: datatorrent gateway application package problem

2016-09-26 Thread Thomas Weise
Hi, Can you try to launch the application using apex CLI instead of the UI? That might help to determine if it is a problem with the Hadoop install or the gateway: http://apex.apache.org/docs/apex/apex_cli/ Thanks, Thomas On Mon, Sep 26, 2016 at 5:04 PM, David Yan wrote: > Added back users@

Re: Can Apex launch command be submitted without using Apex CLI interactively?

2016-09-13 Thread Thomas Weise
Hi, The apex CLI an also be used in batch mode -e you can also use a script file like so: apex < mycmds.txt Other apex CLI documentation is here: http://apex.apache.org/docs/apex/apex_cli/ We will add above to it. HTH, Thomas On Tue, Sep 13, 2016 at 1:02 PM, Meemee Bradley wrote: > Hi

Re: Optional output ports.

2016-09-13 Thread Thomas Weise
> > > > Thanks, > > Alex > > > > *From: *Thomas Weise > *Reply-To: *"users@apex.apache.org" > *Date: *Tuesday, September 13, 2016 at 11:58 AM > > *To: *"users@apex.apache.org" > *Subject: *Re: Optional output ports. > > > > Ale

Re: Optional output ports.

2016-09-13 Thread Thomas Weise
is true, but there is a requirement > implemented by the validator that negates this unless you explicitly set it > to on all your output ports (to the default no less). > > > > > > Thanks, > > Alex > > > > *From: *Thomas Weise > *Reply-To: *"users@ap

Re: Optional output ports.

2016-09-13 Thread Thomas Weise
That's right, if there are multiple output ports, the validation demands that at least one is connected. I actually think this validation is incorrect. It should be up to the application developer to decide how the output of an operator is consumed. It is similar to the return value of a function

Re: HDHT operator

2016-09-12 Thread Thomas Weise
t; > On Mon, Sep 12, 2016 at 5:43 PM, Thomas Weise > wrote: > >> Here is an example for the equivalent functionality in Malhar: >> >> https://github.com/apache/apex-malhar/blob/master/benchmark/ >> src/main/java/com/datatorrent/benchmark/state/StoreOperator.java >

Re: HDHT operator

2016-09-12 Thread Thomas Weise
there are also a couple ready to use operators that use this state management facility. Thomas On Mon, Sep 12, 2016 at 5:34 PM, Thomas Weise wrote: > Naresh, > > Which doc references this operator? Malhar has a replacement for this. Can > you share some more info about your use cas

Re: HDHT operator

2016-09-12 Thread Thomas Weise
Naresh, Which doc references this operator? Malhar has a replacement for this. Can you share some more info about your use case so that we can point you to the appropriate operator to start from? Thanks, Thomas On Mon, Sep 12, 2016 at 5:13 PM, Naresh Guntupalli < naresh.guntupa...@gmail.com> wr

Re: Kyrp exception

2016-09-07 Thread Thomas Weise
The stack trace shows that a resultSet field is part of the checkpointed state. I don't see this property in the latest release (3.5.0). Which version of Malhar are you using? On Wed, Sep 7, 2016 at 2:16 PM, Jaikit Jilka wrote: > Hello, > > I did not clearly understand you question but in JDBC

[ANNOUNCE] Apache Apex Malhar 3.5.0 released

2016-09-05 Thread Thomas Weise
Dear Community, The Apache Apex community is pleased to announce release 3.5.0 of the Malhar library. The release resolved 63 JIRAs and comes with exciting new features and enhancements, including: - High level Java stream API now supports stateful transformations with Apache Beam style windowi

Re: Apex Application failing to run.

2016-09-04 Thread Thomas Weise
Since it is the virtual memory, please see: http://docs.datatorrent.com/configuration/#hadoop-tuning -- sent from mobile On Sep 4, 2016 12:10 PM, "Munagala Ramanath" wrote: > It looks like there is not enough memory for all the containers. If you're > running in the sandbox, how much memory is

Re: HDHT question - looking for the datatorrent gurus!

2016-09-03 Thread Thomas Weise
rrect windowing, etc? > > > > Jim > > > > *From:* Thomas Weise [mailto:thomas.we...@gmail.com] > *Sent:* Saturday, September 3, 2016 11:47 AM > > *To:* users@apex.apache.org > *Subject:* Re: HDHT question - looking for the datatorrent gurus! > > > > Jim

Re: HDHT question - looking for the datatorrent gurus!

2016-09-03 Thread Thomas Weise
Jim, You need to use the delay operator for iterative processing. Here is an example: https://github.com/apache/apex-malhar/blob/master/demos/iteration/src/main/java/com/datatorrent/demos/iteration/Application.java Thomas On Sat, Sep 3, 2016 at 9:05 AM, Jim wrote: > Tushar, > > I am trying to

Re: HDHT question - looking for the datatorrent gurus!

2016-09-01 Thread Thomas Weise
The user defines how to convert key and value into byte[], so you can use any serialization mechanism you like (custom, Kryo, JSON, etc.). Here is an example for setting up the the serializer: https://github.com/DataTorrent/examples/blob/master/tutorials/hdht/src/test/java/com/example/HDHTTestOpe

Re: HDHT question - looking for the datatorrent gurus!

2016-09-01 Thread Thomas Weise
HDHT is for *embedded* state management, fully encapsulated by the operator. It cannot be used like a central database. Thanks, Thomas On Thu, Sep 1, 2016 at 9:49 AM, Tushar Gosavi wrote: > Hi Jim, > > Currently HDHT is accessible only to single operator in a DAG. Single > HDHT store can not b

Re: Support for dynamic topology

2016-08-30 Thread Thomas Weise
Hi, This depends on the operator Z. If it has multiple input ports and those are optional, you can add P, then connect P to Z (and X), then remove Y. If Z has a single port, then Z and everything downstream would need to be removed or else the change won't result in a valid DAG and won't be accep

Re: kryo Serealization Exception

2016-08-22 Thread Thomas Weise
ute(DTCli.java:2050) > at com.datatorrent.stram.cli.DTCli.launchAppPackage(DTCli.java:3456) at > com.datatorrent.stram.cli.DTCli.access$7100(DTCli.java:106) at > com.datatorrent.stram.cli.DTCli$LaunchCommand.execute(DTCli.java:1895) at > com.datatorrent.stram.cli.DTCli$3.run(DTCli.java:144

Re: kryo Serealization Exception

2016-08-22 Thread Thomas Weise
There is some information available here: http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception If the object is Java serializable, you can set the stream codec or wrap into KryoJdkContainer: https://github.com/apache/apex-malhar/tree/master/library/src/main/

Re: Local CLI Logging

2016-08-19 Thread Thomas Weise
Hi Alex, The CLI is automatically changing the threshold for the log4j console appender. You can supply - to the CLI to enable output at debug level. You can also supply your own log4j.properties with customized loggers through the environment (the example below also includes how you enable de

Re: Round Robin Partitioning with Dynamic Partitioning

2016-08-12 Thread Thomas Weise
If idempotency is needed (replay on recovery) and the order of tuples in the stream can change within a window (multiple upstream partitions), then it may be better to use a stateless hash function that ensures even distribution. On Fri, Aug 12, 2016 at 10:10 AM, Munagala Ramanath wrote: > I as

Re: A proposal for Malhar

2016-08-09 Thread Thomas Weise
There are a bunch of operators that don't have proper state management and also don't support generic windowing (event time etc.). I would suggest to move those out or deprecate them. The new windowing and state management support along with the appropriate aggregators is going to make them obsole

Re: Question on KafkaInputOperator - with kafka 0.8.2.1

2016-07-05 Thread Thomas Weise
Hi Eric, You need to use the 0.8.1 version of the client library. It can talk to the 0.8.2 cluster. Thanks, Thomas On Tue, Jul 5, 2016 at 2:16 PM, Martin, Eric wrote: > Hi all, > > > > I am using the 0.8.x version of the Kafka Input Operators. When using > these operators, I am able to run th

Re: Kafka Input Operator questions

2016-06-29 Thread Thomas Weise
Hi Eric, I would recommend to use the latest version of Apex Core (3.4.0) as well as the latest version of Malhar (3.4.0 as well). My responses inline based on above versions: On Wed, Jun 29, 2016 at 5:54 AM, Martin, Eric wrote: > Hi, > > > > I have a few quick questions around the Kafka Input

How to write a unit test that runs embedded cluster and verifies results

2016-06-23 Thread Thomas Weise
Posting it here as the question comes up often. When testing multiple operators in a DAG, how do I know that it "worked"? Following is the anti-pattern: @Test public void testSomeMethod() throws Exception { LocalMode lma = LocalMode.newInstance(); lma.prepareDAG(new Application(), ne

Re: Alerts

2016-06-22 Thread Thomas Weise
You can obtain the application status from YARN. But that's not detailed enough to diagnose what is happening within the application. For that, you can use the Apex app master REST API. If you are using the DT community edition, you get both through the gateway REST API. More information about the

  1   2   >