From Sunil's note it sounds like the DB is in production use by other
systems during certain hours, and he wants the Apex-based migration to happen
at other (non-overlapping) times.
Ram
On May 23, 2016 2:32 PM, "Priyanka Gugale" wrote:
> Hi Sunil,
>
> I don't think we have exactly pause and resume
t folders ?
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> *From:* Munagala Ramanath [mailto:r...@datatorrent.com]
> *Sent:* 2016, May, 20 2:35 PM
> *To:* us...@apex.incubator.apache.org
> *Subject:* Re: Information Needed
>
>
>
> It appears
s?
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> *From:* Munagala Ramanath [mailto:r...@datatorrent.com]
> *Sent:* 2016, May, 24 12:08 PM
> *To:* users@apex.apache.org
> *Subject:* Re: Information Needed
>
>
>
> For scheduling, there is no
> know this I can read the corresponding configuration file before parsing
> the line.
>
>
>
> Please let me know how to handle this.
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> *From:* Munagala Ramanath [mailto:r...@datatorrent.com]
> *Sent:* 2016, May,
> Do you have sample usage for partitioning with individual configuration
> set ups different partitions?
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> *From:* Munagala Ramanath [mailto:r...@datatorrent.com]
> *Sent:* 2016, May, 25 12:11 PM
> *To:* users@apex
to use partitioning , I will meanwhile
> try to understand the partitioning.
>
>
>
> Your support is well appreciated.
>
>
>
> Regards,
>
> Surya Vamshi
>
> *From:* Munagala Ramanath [mailto:r...@datatorrent.com
> ]
> *Sent:* 2016, May, 26 7:32 PM
>
"slices" where each slice monitors a single directory.
Ram
On Wed, May 25, 2016 at 9:55 AM, Munagala Ramanath
wrote:
> I'm hoping to have a sample sometime next week.
>
> Ram
>
> On Wed, May 25, 2016 at 9:30 AM, Mukkamula, Suryavamshivardhan (CWM-NR) <
> suryava
http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception
Please try the suggestions at the above link.
It appears from
https://github.com/uber/confluent-schema-registry/blob/master/avro-serializer/src/main/java/io/confluent/kafka/serializers/KafkaAvroDecoder.jav
Hi,
For Dynamic Partitioning, please take a look at the example at:
https://github.com/DataTorrent/examples/tree/master/tutorials/dynamic-partition
as well as the Malhar class:
https://github.com/apache/apex-malhar/blob/master/library/src/main/java/com/datatorrent/lib/partitioner/StatelessThrough
ation.apa file.
> Is it right?
> And the using below commands (apexcli) , we can modify the DAG on the fly.
>
> create-operator operator-name class-name    Create an operator
> create-stream stream-name from-operator-name from-port-name to-operator-name to-port-name    Create a stream
&
You'll need to have some limit on how much lag is possible for
out-of-order messages.
If that limit is, say, 30s, then you'll need to buffer tuples for double the
lag -- 60s.
You can configure the Application Window size suitably to do this.
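The buffering idea can be sketched in plain Java (illustrative only, not an
Apex operator API; the 30s/60s figures follow the example above):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Sketch: hold out-of-order tuples and release them, in timestamp order,
// once they are older than twice the maximum expected lag (60s for a 30s lag).
public class LagBuffer {
    static final long MAX_LAG_MS = 30_000;
    static final long HOLD_MS = 2 * MAX_LAG_MS;  // 60s, double the lag

    // min-heap ordered by event timestamp; each entry is {timestamp, payload}
    private final PriorityQueue<long[]> buf =
        new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));

    public void add(long eventTs, long payload) {
        buf.add(new long[] { eventTs, payload });
    }

    // Release everything whose timestamp is at least HOLD_MS behind 'now'.
    public List<Long> drainReady(long now) {
        List<Long> out = new ArrayList<>();
        while (!buf.isEmpty() && buf.peek()[0] <= now - HOLD_MS) {
            out.add(buf.poll()[1]);
        }
        return out;
    }
}
```

In an operator, drainReady() would typically be driven from endWindow() so the
application window size bounds how long tuples are held.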
Ram
On Thu, Jun 9, 2016 at 10:40 AM, Raja.Aravapalli
ould be emitted
>> in the operator thread only. This can be done in endWindow()
>>
>> --
>> sent from mobile
>> On Jun 9, 2016 11:46 AM, "Sandesh Hegde" wrote:
>>
>>>
>>> How about something like this,
>>>
>>> Store th
You don't need dynamic partitioning to achieve that topology. You can
simply create your DAG as: A --> X --> Y and then set the *PARTITIONER*
attribute on X
as discussed in the "Advanced Features" section of the TopN words tutorial
at:
http://docs.datatorrent.com/tutorials/topnwords-c7/
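As an illustration (assumption: the tutorial's actual stanza may differ in
operator name and partition count), the *PARTITIONER* attribute could be set
in properties.xml with apex-core's StatelessPartitioner like this:

```xml
<property>
  <name>dt.operator.X.attr.PARTITIONER</name>
  <value>com.datatorrent.common.partitioner.StatelessPartitioner:2</value>
</property>
```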
The stanza
nt DAG even though it is
> not included in *.apa file.
>
>
> Thanks,
> Junguk
>
>
> 2016-06-10 12:03 GMT-04:00 Munagala Ramanath :
>
>> You don't need dynamic partitioning to achieve that topology. You can
>> simply create your DAG as: A --> X --> Y
Raja,
Can you find the machine running YARN ?
Look for log files on that machine -- usually, they are under
*/var/log/hadoop-yarn* or */var/log/hadoop* or similar locations.
The files themselves will have names that vary depending on your
installation; some examples:
*yarn--resourcemanager-.log*
Could you take a look at the YARN logs ?
Ram
On Tue, Jun 14, 2016 at 5:43 AM, Shubham Pathak
wrote:
> Hello,
>
> I have setup a 3 node hadoop cluster using Cloudera Express 5.7.0. and
> installed Dt RTS community edition datatorrent-rts-3.3.0.bin.
> Total memory available is 24 GB.
>
> However
partition. If the downstream
> operators can keep up with the rate at which the file reader emits, then
> the memory consumption should be minimal. Keep in mind though that the
> container memory is not just heap space for the operator, but also memory
> the JVM requires to run and the memor
f36cb193c1c by jenkins source checksum
> 48db4b572827c2e9c2da66982d14
>
> 7626",
>
> "resourceManagerVersion": "2.7.1.2.3.2.0-2950",
>
>"resourceManagerVersionBuiltOn": "2015-09-30T18:20Z",
>
> "rmStateStoreNam
http://docs.datatorrent.com/configure_IDE/
Is that what you're looking for ?
Ram
On Fri, Jun 17, 2016 at 9:44 AM, Raja.Aravapalli wrote:
>
> Hi,
>
> Can someone help me with, how can I get, apex maven archetype into
> IntelliJ ?
>
> When I select to create a new maven project in IntelliJ, I am
By "relaunches" I assume you mean you're launching with the previous
application ID ?
When you do that, the platform attempts to restore state from the
previously saved serialized form.
To do that, it needs to first create the object with a no-arg constructor
and then populate it.
If that construct
eSerializer()
> ?
>
> Thanks a lot.
>
>
> Regards,
> Raja.
>
> From: Munagala Ramanath
> Reply-To: "users@apex.apache.org"
> Date: Monday, June 20, 2016 at 4:32 PM
> To: "users@apex.apache.org"
> Subject: Re: Application restarts
>
> B
No, I don't have an example but several approaches are possible depending
on the
exact requirements, e.g.:
1. How large is the number of directories ?
2. Is the desired sequence a total order or a partial order (i.e. DAG,
https://en.wikipedia.org/wiki/Partially_ordered_set) ?
If the number of dire
Please look at:
http://docs.datatorrent.com/beginner/ and
http://docs.datatorrent.com/application_packages/
for examples on how to set properties from XML files.
Ram
On Thu, Jun 23, 2016 at 3:25 PM, Jaikit Jilka wrote:
> Hello,
>
> How to pass an empty string for updatecommand in properties.x
This may not be the only issue but you've got "-DskitTests" instead of
"-DskipTests"
Ram
On Thu, Jun 23, 2016 at 4:40 PM, Rajesh Kaushal
wrote:
> Hi,
>
> System Details
> Ubuntu 16.04 LTS running on Dell Inspiron Laptop.
> RAM: 6GB
>
> I cloned apex core and malhar. I am running below command i
Yes, that should be OK -- the constructor and setup methods are called on
recovery from failure.
Ram
On Mon, Jun 27, 2016 at 6:27 AM, McCullough, Alex <
alex.mccullo...@capitalone.com> wrote:
> Hey All,
>
>
>
> As I understand it, the only thing that should not be defined as transient
> in an op
y partition wait for
> trigger from kafka (or) entry in Database, is it inside the
> definepartition? Do you have any sample code for the same. What I am
> currently doing to generate the partition is source property in the
> properties file for each directory. I am processing the each fi
Not sure I fully understand the question but you can add whatever fields
you need
to your class that extends *AbstractFileInputOperator*. For example,
https://github.com/DataTorrent/examples/blob/master/tutorials/fileIO-multiDir/src/main/java/com/example/fileIO/FileReaderMultiDir.java
defines fiel
oryScanner)
> scanners.get(i);
>
> scn.setStartIndex(first);
>
> scn.setEndIndex(last);
>
> scn.setDirectory(dir);
>
>
>
> oper.setScanner(scn);
>
> newPartitions.add(new DefaultPartition<>(oper));
>
> newManagers.add(oper.getIdem
e know ?
>
> In my current definepartition() method , I am doing similarly like below,
> but I have to add setter and getter methods in AbstractFileInputOperator
> class.
>
>
>
> for (int j = 0; j < numDirs; ++j) {
>
> int first = sliceFirstIndex[j];
>
>
Nothing available out-of-the-box but there are some pieces that may be
useful:
https://github.com/apache/apex-malhar/blob/master/library/src/main/java/com/datatorrent/lib/io/AbstractHttpInputOperator.java
For the Zip part, there is an example using GZip for output here:
https://github.com/apache/a
To add some additional explanation to what Sandesh said, it looks like your
operator is
dying or getting killed after some time, so you should look at the
Application Master logs to find
out why this is happening.
When it goes down, a new operator is created and state from an earlier
checkpoint is
Please take a look at the sample application at:
https://github.com/amberarrow/samples/tree/master/custom-codec
It uses the StreamCodec that you provided where tuples have the form
"i^j^hello" where i runs sequentially
through the natural numbers and j (the key) cycles through the range
(1..5).
W
Please take a look at the Python script under
https://github.com/DataTorrent/examples/tree/master/tools
It uses the Gateway REST API to retrieve application info given the name;
the id is part of that JSON object.
Ram
On Tue, Jul 12, 2016 at 6:58 AM, Mukkamula, Suryavamshivardhan (CWM-NR) <
surya
Please see: http://docs.datatorrent.com/troubleshooting/#configuring-memory
Ram
On Tue, Jul 12, 2016 at 6:57 AM, Raja.Aravapalli wrote:
>
> Hi,
>
> My DAG is failing with memory issues for container. Seeing below
> information in the log.
>
>
>
> Diagnostics: Container [pid=xxx,containerID=cont
he memory
> allocated for my DAG ? Is there is any other way, I can increase the
> memory ?
>
>
> Thanks a lot.
>
>
> Regards,
> Raja.
>
> From: Munagala Ramanath
> Reply-To: "users@apex.apache.org"
> Date: Tuesday, July 12, 2016 at
So with the above settings at cluster level, I can’t increase the memory
>> allocated for my DAG ? Is there is any other way, I can increase the
>> memory ?
>>
>>
>> Thanks a lot.
>>
>>
>> Regards,
>> Raja.
>>
>> From: Munagala Ramana
There is already a link to a troubleshooting page at bottom of
https://apex.apache.org/docs.html
That page already has some discussion under the section entitled
"Calculating Container Memory"
so adding new content there seems like the right thing to do.
Ram
On Mon, Jul 18, 2016 at 11:27 PM, Chin
e location. I shall raise PR with
> the given suggestions.
>
> --prad
>
> On Tue, Jul 19, 2016 at 5:49 AM, Munagala Ramanath
> wrote:
>
> > There is already a link to a troubleshooting page at bottom of
> > https://apex.apache.org/docs.html
> > That page al
One way is to have a pass-through operator X that is parallel partitioned
like your B currently.
Then, connect the output port of X to B and use a suitable partitioner for
B to create as many
partitions as you want: A -> X -> B -> C.
Ram
On Mon, Jul 25, 2016 at 9:41 AM, Yogi Devendra
wrote:
> H
Which version of Hadoop ? If it is 2.6.X, upgrading to 2.7.X might fix the
problem, e.g.
https://marc.info/?l=hadoop-user&m=142501302419779&w=2
Ram
On Tue, Jul 26, 2016 at 1:59 PM, Raja.Aravapalli wrote:
>
> Hi,
>
> My DAG fails every week atleast once with the below information in the
> log.
>
"relinquish" => "acquire" :-)
Ram
On Fri, Aug 5, 2016 at 6:01 AM, Chinmay Kolhatkar
wrote:
> Hi Alex,
>
> setup and activate both are guaranteed to be called before starting of
> dataflow. Hence in your case either should be fine.
>
> There might be few hundred ms of gap between setup call and
Try throwing a ShutdownException (in Operator.java)
Ram
On Tue, Aug 9, 2016 at 7:16 AM, McCullough, Alex <
alex.mccullo...@capitalone.com> wrote:
> Hey All,
>
>
>
> Operators that are based on an underlying data store, when trying to
> connect in setup or activation, if an exception is thrown I
For cases where use of a different thread is needed, it can write tuples to
a queue from where the operator thread pulls them -- JdbcPollInputOperator
in Malhar has an example.
Ram
On Wed, Aug 10, 2016 at 1:50 PM, hsy...@gmail.com wrote:
> Hey Vlad,
>
> Thanks for bringing this up. Is there an
Alex,
Please take a look at:
https://github.com/DataTorrent/examples/blob/master/tutorials/dynamic-partition/src/main/java/com/example/dynamic/Gen.java
It shows an operator that implements both the *Partitioner* and the
*StatsListener* interface.
The *processStats()* method of the latter interfac
16 at 9:53 AM, McCullough, Alex <
alex.mccullo...@capitalone.com> wrote:
> Thanks Ram.
>
>
>
> If I didn’t want dynamic partitioning and just round robin on a fixed # of
> partitions, can it just be set through a property? If so, what is the
> property?
>
>
>
If you are dealing with file data purely as byte arrays and copying them
from one place to another, you need not worry about the language or charset
since the bytes are preserved.
If you are converting them to Strings explicitly or using classes that might
do so implicitly, you need to specify an
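For example, a sketch of an explicit conversion (assuming UTF-8 is the
intended charset):

```java
import java.nio.charset.StandardCharsets;

// Sketch: decoding bytes with an explicit charset avoids depending on the
// platform default encoding, which can differ between machines.
public class CharsetDemo {
    public static String decode(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```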
We see the following exception w.r.t. partitioning; can you describe how
your code is handling partitioning ?
Ram
---
Exception: Partitioner returns null or empty.
java.lang.IllegalStateException: Partitioner returns null or empt
It looks like there is not enough memory for all the containers. If you're
running in the sandbox, how much memory is allocated to the sandbox ?
Can you try increasing it ?
There is also a detailed discussion of how to manage memory using
configuration
parameters in the properties files at:
http:/
Can you share the code for your file output operator and the properties
file where
you configure the file path and name ?
Please take a look at the examples at
https://github.com/DataTorrent/examples/tree/master/tutorials
in particular at the "fileIO-simple" example for details on how to
configur
Looks like this happens if jersey-server is not present
(e.g.
http://stackoverflow.com/questions/8662919/jersey-no-webapplication-provider-is-present-when-jersey-json-dependency-added
)
Have you made any changes to the pom.xml ? If so, can you post it here ?
Also, can you tell us a bit more about
Yes, if you have ports, at least one must be connected when there are no
annotations on them.
The code is in LogicalPlan.validate() -- checkout the allPortsOptional
variable.
Ram
On Tue, Sep 13, 2016 at 3:17 AM, Tushar Gosavi
wrote:
> Hi All,
>
> I have an input operator with one output port with
sed the apex command to launch the apa file.
>
>
> Thanks
> Sanal
>
> On Tue, Sep 13, 2016 at 1:29 AM, Munagala Ramanath
> wrote:
>
>> Looks like this happens if jersey-server is not present
>> (e.g. http://stackoverflow.com/questions/8662919/jersey-no-
>> webap
ive node failure
> count = 2147483647
> 2016-09-19 01:21:27,348 INFO org.apache.hadoop.yarn.client.RMProxy:
> Connecting to ResourceManager at /0.0.0.0:8032
>
> Thanks for any help.
>
> Best regards
>
> Sanal
>
>
>
> On Mon, Sep 19, 2016 at 6:03 PM, Sanal
gt; Removing container request: [Capability[]Priority[4],
> Capability[]Priority[3], Capability[ vCores:1>]Priority[2]]
> 2016-09-20 09:07:41,512 INFO com.datatorrent.stram.StreamingAppMasterService:
> Removed container: Capability[]Priority[4]
> 2016-09-20 09:07:41,512 INFO com.datat
We now have these system properties [defaults in brackets]:
A. *com.datatorrent.stram.rpc.timeout [5_000]*
B. *com.datatorrent.stram.rpc.delay.timeout [10_000]*
C. *com.datatorrent.stram.rpc.retry.timeout [30_000]*
We also have the attribute:
D. *HEARTBEAT_TIMEOUT_MILLIS [30_000]*
What, if
http://docs.datatorrent.com/configuration/#custom-log4j-properties-for-application-packages
Please take a look at that page for more info.
Ram
On Tue, Oct 4, 2016 at 10:28 AM, Doyle, Austin O. <
austin.o.doyl...@leidos.com> wrote:
> I am trying to update the log4j.properties file for an apex ap
You can use Java 8 but the source and target compatibility configuration
parameters in
your pom.xml for the maven-compiler-plugin still need to be 1.7
Ram
On Wed, Oct 5, 2016 at 9:14 AM, Feldkamp, Brandon (CONT) <
brandon.feldk...@capitalone.com> wrote:
> So is it safe to say that JDK 1.8 is sup
ly works with java 8 classes/types
>>
>> Regards,
>> Siyuan
>>
>>
>>
>> On Wed, Oct 5, 2016 at 9:34 AM, Munagala Ramanath
>> wrote:
>>
>>> You can use Java 8 but the source and target compatibility configuration
>>> parameters
Please post the following:
1. Entire pom.xml
2. output of "mvn dependency:tree"
3. output of "jar tvf " run on your application package file (with .apa
extension)
Ram
On Fri, Oct 7, 2016 at 10:10 AM, Jaspal Singh
wrote:
> Thomas,
>
> We have added the dependency in pom.xml for kafka client API
If you want round-robin distribution which will give you uniform load
across all partitions you can use
a StreamCodec like this (provided the number of partitions is known and
static):
*public class CatagoryStreamCodec extends
KryoSerializableStreamCodec {*
* private int n = 0;*
* @Override*
*
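The quoted codec is truncated; here is a plain-Java sketch of the round-robin
assignment it describes (the real Apex version would extend
KryoSerializableStreamCodec and override getPartition(); the class name and
partition count here are illustrative):

```java
// Sketch: assign each successive tuple to the next partition in cyclic
// order, giving uniform load regardless of tuple content. Assumes the
// number of partitions is known and static, as noted above.
public class RoundRobinCodec {
    static final int NUM_PARTITIONS = 4;  // hypothetical fixed count
    private int n = 0;

    public int getPartition(Object tuple) {
        return n++ % NUM_PARTITIONS;
    }
}
```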
To add to Ashwin's diagnosis, assuming you have *properties.xml* setup as
described in that tutorial,
you need to add to the YARN configuration file *yarn-site.xml* at:
*/sfw/hadoop/current/etc/hadoop*
the following stanza:
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>128</value>
</property>
and restart the Hadoop daemo
>I have set the YARN minimum allocation property *in property.xml of the
>project*, which doesn't solve the problem.
Setting it in your project will not help; you need to set it in
yarn-site.xml
and restart YARN. If you're running on the sandbox, the location of that
file is in my earlier
message.
The application id is not known until the application starts running so
that kind of substitution
likely won't be possible.
Can you not simply check if filePath already ends with the application id
before appending ? e.g. :
*String appid =
Context.OperatorContext.getValue(Context.DAGContext.APPLI
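The suffix check suggested above can be sketched like this (the helper name is
hypothetical; retrieving the application id from the DAG context is elided, as
in the quote):

```java
// Sketch: append the application id to filePath only when it is not
// already the trailing path component, making the operation idempotent.
public class PathUtil {
    public static String withAppId(String filePath, String appId) {
        return filePath.endsWith(appId) ? filePath : filePath + "/" + appId;
    }
}
```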
Couple of questions:
Could you outline the steps you followed to run this application ?
Do you have any custom code that is creating or modifying a Configuration
object ?
Can you try building and running some of the example programs at:
https://github.com/DataTorrent/examples/tree/master/tutorials
For Req1, a better approach might be to have a single "error reporting"
operator that connects to the DB;
all the other operators can send custom error tuples to it with appropriate
details of the type of error.
That way, if there are issues that are DB-related, you have 1 place to
debug them inste
Since the diagnostic is about exceeding virtual memory limits, please see:
http://docs.datatorrent.com/configuration/#yarn-vmem-pmem-ratio-tuning
for an alternative solution.
Ram
On Mon, Nov 21, 2016 at 4:20 AM, Max Bridgewater
wrote:
> The issue turned out to be memory allocation. Here is
Could you please provide additional details ? Are you doing this in code or
from a properties file ?
If from a properties file, please paste the complete XML fragment; if from
code please show the
complete code fragment. Where is the $strtDate variable defined ? How are
you determining that
"it is
Usually this sort of error is a symptom of including hadoop jars in your
application package.
Please make sure this is not happening. This link has some info:
http://docs.datatorrent.com/troubleshooting/#hadoop-dependencies-conflicts
If you run "jar tvf target/*.apa | grep hadoop" you'll see whic
https://ci.apache.org/projects/apex-malhar/apex-malhar-javadoc-release-3.5/index.html
Please use that link for Apex Malhar release 3.5. I'll post a link for Apex
Core later.
Ram
On Tue, Nov 22, 2016 at 4:08 PM, Devavrath S wrote:
> Hi All,
>
> I was referring to the JAVA API Documentation whos
The .apa file content can be listed with:* jar tvf ./target/*.apa*
Could you post the output of "*mvn dependency:tree*" ?
Ram
On Fri, Nov 25, 2016 at 4:15 AM, prakashdutt...@hotmail.com <
prakashdutt...@hotmail.com> wrote:
> The apa lists message is showing as below
>
> apex> launch /work/m
One way is to use an external process that invokes appropriate REST API
calls and checks
results. An sample Python script to do this for the application as a whole
is at:
https://github.com/DataTorrent/examples/blob/master/tools/monitor.py
The REST API is documented at: http://docs.datatorrent.com
A brief description is here:
http://docs.datatorrent.com/beginner/#buffer-server
It is indeed part of apex-core; see:
https://github.com/apache/apex-core/tree/master/engine/src/main/java/com/datatorrent/stram/
as well as the *stream* subdirectory.
Ram
On Fri, Nov 25, 2016 at 11:13 AM, Max Bridge
Max,
The classes under *contrib/src/main/java/com/datatorrent/contrib/kafka* use
the old 0.8 Kafka API
whereas those under *kafka/src/main/java/org/apache/apex/malhar/kafka* use
the new 0.9 API.
*KafkaSinglePortInputOperator* (used in your working version) is in the
latter whereas
*AbstractKafkaSi
Just to be sure, did that change resolve the issue ?
Ram
On Fri, Dec 2, 2016 at 11:54 AM, Max Bridgewater
wrote:
> OK. Thanks guys. It continued to fail with min 256 and max 512 or 1024. In
> the end I switched off the check of ratio virtual/physical memory. My
> config is now:
>
>
>yarn.n
To further clarify Bhupesh's comment, suppose you determine in window N in
the input operator the
data reading phase is complete and send the control tuple on the dedicated
port to the output
operator in window N+1. If the downstream operators (including the output
operator) P_i are
processing resp
by the operators? Is it end of every
> window?
>
>
> Thanks,
> Vishal
>
>
>
> On Sat, Dec 3, 2016 at 1:46 PM, Munagala Ramanath
> wrote:
>
> To further clarify Bhupesh's comment, suppose you determine in window N in
> the input operator the
> data
For 1, try launching with the previous application id documented at:
http://docs.datatorrent.com/apexcli/
under the *-originalAppId* parameter of the *launch* command. The starting
offset can also be controlled
by the *initialOffset* configuration property documented at:
http://apex.apache.org/docs
W.r.t. item 3, it may be an issue of whether the group is active or not and
whether it has been flushed from
the memory cache. Some discussion here:
http://grokbase.com/t/kafka/users/15bsxxgp49/new-kafka-consumer-groups-sh-not-showing-inactive-consumer-gorup
Ram
On Thu, Dec 8, 2016 at 8:36 AM, Ar
Once log aggregation is enabled as Priyanka described, you can retrieve
logs for a specific
application with a command like this:
*yarn logs -applicationId application_1473359658420_0588*
Ram
On Mon, Dec 12, 2016 at 2:29 AM, Priyanka Gugale wrote:
> You need to set hadoop configuration "yarn.l
In addition to the methods mentioned by others, there is also the REST API
documented
at: http://docs.datatorrent.com/dtgateway_api/
The calls:
*POST /ws/v2/appPackages?merge={replace|fail|ours|theirs}*
POST
/ws/v2/appPackages/{owner}/{packageName}/{packageVersion}/applications/{appName}/launch[?
Are you reusing objects in the input operator ? If so, try creating a new
object
for each tuple.
Ram
On Tue, Dec 20, 2016 at 5:38 AM, Doyle, Austin O. <
austin.o.doyl...@leidos.com> wrote:
> Also as a follow up, its not just repetition of some of the data points,
> it’s also just not sending som
The operator has this comment at the top:
* This implementation is generic so it uses offset/limit mechanism for
* batching which is not optimal. Batching is
* most efficient when the tables/views are indexed and the query uses
* this information to retrieve data.
* This can be achieved in sub-
There are multiple approaches, each with its own tradeoffs. Here is the first
step:
A1. Create a pair of in-memory non-transient queues to hold the tuples
(non-transient
because we want them to be checkpointed and restored on recovery
from failure).
A2. Create a separate thread that waits for a
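Steps A1 and A2 can be sketched with standard java.util.concurrent types
(names are illustrative, not Apex API; in a real operator the queues would be
non-transient fields so they are checkpointed and restored on recovery):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the two-queue pattern: the operator thread enqueues requests,
// a separate worker thread processes them and hands results back.
public class QueuePair {
    final BlockingQueue<String> requests = new LinkedBlockingQueue<>();
    final BlockingQueue<String> results = new LinkedBlockingQueue<>();

    // A2: separate thread waits for work and pushes results back.
    final Thread worker = new Thread(() -> {
        try {
            while (true) {
                String req = requests.take();     // blocks until work arrives
                results.put("processed:" + req);  // hand result to operator thread
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // exit cleanly on shutdown
        }
    });
}
```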
Could you provide more details on what you're running, i.e. which version
of Apex, etc. ?
Also, what is santry ?
Do any of the other REST API calls work, like "/about" for instance ?
Ram
On Mon, Feb 6, 2017 at 8:35 AM, chiranjeevi vasupilli
wrote:
> bringing on top..
>
> On Sun, Feb 5, 2017 at
Ram,
>
> we are running a Data Torrent application using the apex 3.2.1 build.
> Santry i.e authentication required to access DT console.
>
>
>
> On Mon, Feb 6, 2017 at 10:16 PM, Munagala Ramanath
> wrote:
>
>> Could you provide more details on what you're running, i.e.
Looks like you may be blocking the operator thread with the p.waitFor() and
the rest of the code
to process the child process output.
Try using a separate thread to handle the child as described, for example,
here:
http://stackoverflow.com/questions/26319804/adapting-c-fork-code-to-a-java-program
Please note that tuples should not be emitted by any thread other than the
main operator thread.
A common pattern is to use a thread-safe queue and have worker threads
enqueue
tuples there; the main operator thread then pulls tuples from the queue and
emits them.
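A minimal sketch of that pattern, assuming a String tuple type (names are
illustrative; the real output.emit() call is replaced by a counter here):

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch: worker threads enqueue tuples; only the main operator thread
// (the caller of emitPending) ever emits them.
public class EmitQueue {
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();

    // Safe to call from any worker thread.
    public void enqueue(String tuple) {
        pending.add(tuple);
    }

    // Called only from the operator thread, e.g. inside emitTuples().
    public int emitPending() {
        int emitted = 0;
        String t;
        while ((t = pending.poll()) != null) {
            // output.emit(t) in a real operator; counted here for the sketch
            emitted++;
        }
        return emitted;
    }
}
```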
Ram
On Tue, Feb 21, 2017 at 10:0
fkaStreamInput =
>>
>> new DefaultInputPort() {
>>
>> List errors = new ArrayList();
>>
>> @Override
>>
>> public void process(String consumerRecord) {
>>
>> //Code for normal tuple process
>>
>> //Code
It likely means that that operator is taking too long to return from one of
the callbacks like beginWindow(), endWindow(),
emitTuples(), etc. Do you have any potentially blocking calls to external
systems in any of those callbacks ?
Ram
On Tue, Feb 28, 2017 at 11:09 AM, Sunil Parmar
wrote:
> 20
operator '
> > > >
> > > > Regards,
> > > > Ashwin.
> > > >
> > > > On Tue, Feb 28, 2017 at 12:03 PM, Sunil Parmar <
> spar...@threatmetrix.com
> > > >
> > > > wrote:
> > > >
> > > > > That d
Could you check the AppMaster logs for any anomalous messages, for
instance, ""Marking operator {} blocked committed window {}"
Ram
On Thu, Mar 9, 2017 at 10:48 PM, rishimishra
wrote:
> hi ,
>
> I am running one of the Apex Application , which takes input from kafka and
> other operator parse t
Take a look at the testCompression() method
in AbstractFileOutputOperatorTest.java for an
example.
Ram
On Mon, Mar 13, 2017 at 5:25 PM, Ganelin, Ilya
wrote:
> What is the recommended way to write compressed data to HDFS? Should I
> extend AbstractFileOutputOperator or is there existing support
https://ci.apache.org/projects/apex-core/apex-core-javadoc-release-3.4/index.html
https://ci.apache.org/projects/apex-core/apex-core-javadoc-release-3.5/index.html
... at the above links.
Ram
--
___
Munagala V. Ramanath
Software Engineer
Please take a look at: http://apex.apache.org/docs.html
The Beginner's Guide is a good place to start. Briefly stated, you'll need
to build your application package
using maven and deploy it using the commandline tool "apex" that is in the
apex-core repository.
Ram
On Wed, Mar 22, 2017 at 4:19 PM
Looks like this was a breaking change introduced in Hadoop 2.7.0:
In
https://hadoop.apache.org/docs/r2.6.5/api/org/apache/hadoop/yarn/conf/YarnConfiguration.html
we have:
static long DELEGATION_TOKEN_MAX_LIFETIME_DEFAULT
static String DELEGATION_TOKEN_MAX_LIFETIME_KEY
But in
https://hadoop.apach
If log file aggregation is enabled, aggregated logs for terminated
applications should be available on HDFS and retrievable with:
*yarn logs -applicationId ***
Log aggregation is discussed further at:
https://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/
and is controlled b
http://docs.datatorrent.com/application_packages/
Please take a look at the "Operator properties" section.
Ram
On Wed, Apr 12, 2017 at 4:17 PM, Sunil Parmar
wrote:
> We've a use case where we want to set an operator property of type Map
> (String, Long) in the property xml. Is there a way to
Could you provide some details on what types your multiple keys are and what
expression variants you tried and what the result was ?
The base class of BoundedDedupOperator is AbstractDeduper; within that class
you'll see a method getKey(); you should be able to override that to retrieve
the de
It needs to be an expression that combines both (or all) values: try "id + id1"
Ram
On Monday, October 23, 2017, 6:04:14 PM PDT, Vivek Bhide
wrote:
Thanks Ram for your suggestions
Field types that I am trying are the basic primitive types. In fact, I was
just playing around with the
The mapping: tuple -> dedup-key needs to be 1-1; if multiple tuples are mapped
to the same dedup key you'll see problems like this. In your case multiple
tuples can be mapped to the same value of "id + id1". For example, all tuples
with (id, id1) being any of these pairs will all map to a value
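A tiny sketch of the collision (assuming numeric fields, so "id + id1" is
arithmetic addition; the helper name is hypothetical):

```java
// Sketch: the dedup key "id + id1" is not 1-1 -- distinct tuples
// can collide on the same key, e.g. (1, 2) and (2, 1) both map to 3.
public class DedupKeyDemo {
    public static int key(int id, int id1) {
        return id + id1;
    }
}
```

A 1-1 alternative would combine the fields in a way that preserves both
values, e.g. a delimited string of the two fields.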