Re: OOMs with Dropbox Camel module

2016-11-17 Thread Edoardo Causarano
Hi there, I've done some work on this but I'd like to know a couple things
now:

Should I branch from master and you take care of cherry picking?

What language level should I use? Can I go with Java8?

Does it need to be 100% backwards compatible with the existing component?
Some of the headers in the current impl can be improved IMHO but current
usage would break if I changed them arbitrarily... is there a deprecation
mechanism for headers?

Advice on testing: Dropbox doesn't provide a testing module, is there a
robust way to create a test harness? Some sort of record/replay tool
perhaps?


Best,
Edoardo

On Thu, 10 Nov 2016 at 14:56, Claus Ibsen  wrote:

> Hi
>
> Yeah you can take a look at how some of the dataformats do that, as
> they have out-of-band streaming support with Camel's stream caching
> http://camel.apache.org/stream-caching.html
>
> We love contributions, so you are very welcome to look into this and
> provide a PR / patch. And to log a JIRA ticket.
> http://camel.apache.org/contributing
>
> On Thu, Nov 10, 2016 at 11:50 AM, Edoardo Causarano
>  wrote:
> > Hi all,
> >
> > just moved code from dev to testing and found that in real-life the
> DropBox component blows apart in OOMs. Seems that it’s using plain BAOS
> (sic) to buffer remote data which is not a particularly good idea when you
> have no idea how big the files will be (or you’re pretty certain they’re in
> the multiple GB range.)
> >
> > I’d rather not implement a client from scratch so I’d appreciate some
> guidance on how to fix the existing camel-dropbox component to use
> stream-caching.
> >
> >
> > Best,
> > Edoardo
>
>
>
> --
> Claus Ibsen
> -
> http://davsclaus.com @davsclaus
> Camel in Action 2: https://www.manning.com/ibsen2
>


OOMs with Dropbox Camel module

2016-11-10 Thread Edoardo Causarano
Hi all,

just moved code from dev to testing and found that in real-life the DropBox 
component blows apart in OOMs. Seems that it’s using plain BAOS (sic) to buffer 
remote data which is not a particularly good idea when you have no idea how big 
the files will be (or you’re pretty certain they’re in the multiple GB range.)

I’d rather not implement a client from scratch so I’d appreciate some guidance 
on how to fix the existing camel-dropbox component to use stream-caching.


Best,
Edoardo
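
For illustration, the buffering problem described above can be avoided by spooling the remote stream to disk in fixed-size chunks instead of collecting it in a ByteArrayOutputStream. This is only a minimal sketch using plain JDK I/O; the class and method names (SpoolToDisk, spool) are hypothetical and are not part of camel-dropbox or of Camel's stream-caching API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of camel-dropbox: copies a remote stream to a
// temporary file so that heap use is bounded by the buffer, not the file size.
final class SpoolToDisk {
    private static final int BUFFER_SIZE = 8192;

    // Returns the temporary file holding the full content; the caller deletes it.
    static Path spool(InputStream remote) throws IOException {
        Path tmp = Files.createTempFile("download-", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int read;
            // Only BUFFER_SIZE bytes are held in memory at any time.
            while ((read = remote.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a remote download: a small in-memory stream.
        Path file = spool(new ByteArrayInputStream(new byte[20_000]));
        System.out.println("spooled " + Files.size(file) + " bytes to " + file);
        Files.delete(file);
    }
}
```

A component could then hand downstream processors a stream opened on the temporary file rather than a byte array, which is roughly what spooling to disk buys you for multi-GB files.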

Re: Question on multicast to pipelines

2016-11-03 Thread Edoardo Causarano
Hi,

I created a jira (CAMEL-10442) for this in case it’s a bug. Thanks for your 
help.


Best,
Edoardo
  
> On 1 Nov 2016, at 19:30, Brad Johnson  wrote:
> 
> Edoardo,
> 
> I missed the second set of logging statements, sorry.  I thought you'd said
> they were outputting the same thing. And, yes, that's a bit
> counter-intuitive. Personally I almost always do it with just to routes in
> the multicast and then any further routing or changes I put in those. And I
> use blueprint for this though I'm slowly switching to the Java DSL.
> 
> [Blueprint XML example; the markup was stripped by the list archive and
> only its two comments survive:]
> 
>    ///modify the body here
>    ///log message here
> 
> On Tue, Nov 1, 2016 at 12:27 PM, DariusX  wrote:
> 
>> Ah, I didn't understand what you were saying before.
>> Your point is that these two should be synonymous, even if they're enclosed
>> in a multicast()...end():
>> 
>> 1) This...
>> .pipeline("direct:A", "direct:B")
>> 
>> 2) and this...
>> .pipeline().to("direct:C").to("direct:D").end()
>> 
>> but only the second one works the way you expect.
>> The first one sends the same input to both routes (i.e. does not send the
>> output of A to B)
>> 
>> 
>> It does seem odd, and a brief test confirms the behavior you describe
>> (sample code here:
>> CamelSandbox/src/main/java/com/zerses/camelsandbox/MulticastPipelinesTest.java).
>> 
>> Looking at the Camel code (line 1165 of
>> src/main/java/org/apache/camel/model/ProcessorDefinition.java), that
>> version of pipeline(String... uri) is simply a synonym for
>> to(String... uri), which would explain how it is working.
>> 
>> Someone else would need to speak to how it *ought* to work, but I agree it
>> does not seem intuitive as-is.
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> View this message in context:
>> http://camel.465427.n5.nabble.com/Question-on-multicast-to-pipelines-tp5789396p5789593.html
>> Sent from the Camel - Users mailing list archive at Nabble.com.
>> 



Re: Question on multicast to pipelines

2016-11-01 Thread Edoardo Causarano
I don’t see how .pipeline().to("A").to("B").end() should not be equivalent to 
.pipeline("A", "B"), or to .to("A", "B") which is, from what I understood from 
the documentation, equivalent to the pipeline statement anyway. I am of course 
changing data in the A and C steps; please look at the gist in my original 
email (and the output of the working example, for that matter).

I understand the intuition that the multicast sends the same message 
downstream; what is counter-intuitive to me is that it reaches *into* the 
pipelines rather than just their heads. Please read my code and the logs I 
added in my last post and let me know if this is expected - albeit backwards - 
behavior or whether it’s a bug.


Best,
Edoardo


> On 1 Nov 2016, at 14:59, Brad Johnson  wrote:
> 
> That's what you should see unless you change the data in A or in C.  A and
> C should both receive START.  It is a multicast. If you change the value in
> A you'll see that change in B but that will not be shown in C since C is at
> the root of the multicast. If you change the data in C you'll see it in D.
> 
> Another way to think of the multicast, if this helps, is that the first
> elements in the multicast are like a pub/sub or JMS topic where each of the
> subscribers receive exactly the same message.
> 
> On Tue, Nov 1, 2016 at 4:49 AM, Edoardo Causarano <
> edoardo.causar...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> these are the results I get, only the most explicit and verbose
>> configuration returns the expected result.
>> 
>> Working route:
>> .pipeline().to("A").to("B").end()
>> .pipeline().to("C").to("D").end()
>> 
>> 10:41:12.644 [main] INFO route1 - after direct:start body=START
>> 10:41:12.666 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>>
>> A Exchange[ID-Spitfire-local-50181-1477993271722-0-3]
>> 10:41:12.667 [main] DEBUG com.esc.test.MulticastPipelinesTest - A got
>> in=START
>> 10:41:12.667 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>>
>> B Exchange[ID-Spitfire-local-50181-1477993271722-0-3]
>> 10:41:12.667 [main] DEBUG com.esc.test.MulticastPipelinesTest - B got in=A
>> 10:41:12.670 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>>
>> C Exchange[ID-Spitfire-local-50181-1477993271722-0-4]
>> 10:41:12.671 [main] DEBUG com.esc.test.MulticastPipelinesTest - C got
>> in=START
>> 10:41:12.671 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>>
>> D Exchange[ID-Spitfire-local-50181-1477993271722-0-4]
>> 10:41:12.671 [main] DEBUG com.esc.test.MulticastPipelinesTest - D got in=C
>> 
>> 
>> Faulty routes:
>> .pipeline("A", "B")
>> .pipeline("C", "D")
>> 
>> or
>> 
>> .to("A", "B")
>> .to("C", "D")
>> 
>> 10:43:46.383 [main] INFO route1 - after direct:start body=START
>> 10:43:46.389 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>>
>> A Exchange[ID-Spitfire-local-50316-1477993425625-0-3]
>> 10:43:46.389 [main] DEBUG com.esc.test.MulticastPipelinesTest - A got
>> in=START
>> 10:43:46.390 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>>
>> B Exchange[ID-Spitfire-local-50316-1477993425625-0-4]
>> 10:43:46.390 [main] DEBUG com.esc.test.MulticastPipelinesTest - B got
>> in=START
>> 10:43:46.391 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>>
>> C Exchange[ID-Spitfire-local-50316-1477993425625-0-5]
>> 10:43:46.391 [main] DEBUG com.esc.test.MulticastPipelinesTest - C got
>> in=START
>> 10:43:46.391 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>>
>> D Exchange[ID-Spitfire-local-50316-1477993425625-0-6]
>> 10:43:46.391 [main] DEBUG com.esc.test.MulticastPipelinesTest - D got
>> in=START
>> 
>> 
>> Best,
>> Edoardo
>> 
>>> On 31 Oct 2016, at 15:04, DariusX  wrote:
>>> 
>>> Your example was:
>>> multicast()
>>>   .pipeline("A", "B")
>>>   .pipeline("C", "D")
>>> .end()
>>> 
>>> You send "START" as the body to this. So, you should expect "START" to be
>>> the in.body for both "A" and "C".
>>> 
>>> The in body for "B" will depend on what "A" does. Example: if "A"
>> transforms
>>> the body to a constant "Hello from A", then that is what "B" will get as
>> its
>>> in.body.
>>> 
>>> Similarly, "D" will get whatever "C" decides to send along.
>>> 
>>> If neither A nor C make any changes to the body, then you should expect
>>> "START" to be the in.body for all four.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://camel.465427.n5.nabble.com/Question-on-multicast-to-pipelines-tp5789396p5789518.html
>>> Sent from the Camel - Users mailing list archive at Nabble.com.
>> 
>> 



Re: Question on multicast to pipelines

2016-11-01 Thread Edoardo Causarano
Hi all,

these are the results I get, only the most explicit and verbose configuration 
returns the expected result. 

Working route: 
.pipeline().to("A").to("B").end()
.pipeline().to("C").to("D").end()

10:41:12.644 [main] INFO route1 - after direct:start body=START
10:41:12.666 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>> A Exchange[ID-Spitfire-local-50181-1477993271722-0-3]
10:41:12.667 [main] DEBUG com.esc.test.MulticastPipelinesTest - A got in=START
10:41:12.667 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>> B Exchange[ID-Spitfire-local-50181-1477993271722-0-3]
10:41:12.667 [main] DEBUG com.esc.test.MulticastPipelinesTest - B got in=A
10:41:12.670 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>> C Exchange[ID-Spitfire-local-50181-1477993271722-0-4]
10:41:12.671 [main] DEBUG com.esc.test.MulticastPipelinesTest - C got in=START
10:41:12.671 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>> D Exchange[ID-Spitfire-local-50181-1477993271722-0-4]
10:41:12.671 [main] DEBUG com.esc.test.MulticastPipelinesTest - D got in=C


Faulty routes:
.pipeline("A", "B")
.pipeline("C", "D")

or 

.to("A", "B")
.to("C", "D")

10:43:46.383 [main] INFO route1 - after direct:start body=START
10:43:46.389 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>> A Exchange[ID-Spitfire-local-50316-1477993425625-0-3]
10:43:46.389 [main] DEBUG com.esc.test.MulticastPipelinesTest - A got in=START
10:43:46.390 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>> B Exchange[ID-Spitfire-local-50316-1477993425625-0-4]
10:43:46.390 [main] DEBUG com.esc.test.MulticastPipelinesTest - B got in=START
10:43:46.391 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>> C Exchange[ID-Spitfire-local-50316-1477993425625-0-5]
10:43:46.391 [main] DEBUG com.esc.test.MulticastPipelinesTest - C got in=START
10:43:46.391 [main] DEBUG org.apache.camel.processor.SendProcessor - >>>> D Exchange[ID-Spitfire-local-50316-1477993425625-0-6]
10:43:46.391 [main] DEBUG com.esc.test.MulticastPipelinesTest - D got in=START


Best,
Edoardo

> On 31 Oct 2016, at 15:04, DariusX  wrote:
> 
> Your example was:
> multicast()
>   .pipeline("A", "B")
>   .pipeline("C", "D")
> .end()
> 
> You send "START" as the body to this. So, you should expect "START" to be
> the in.body for both "A" and "C".
> 
> The in body for "B" will depend on what "A" does. Example: if "A" transforms
> the body to a constant "Hello from A", then that is what "B" will get as its
> in.body.
> 
> Similarly, "D" will get whatever "C" decides to send along.
> 
> If neither A nor C make any changes to the body, then you should expect
> "START" to be the in.body for all four.
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://camel.465427.n5.nabble.com/Question-on-multicast-to-pipelines-tp5789396p5789518.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
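
To make the difference between the two log excerpts above concrete, here is a small model in plain Java. It does not use Camel at all; the names (MulticastModel, step, pipeline, multicast) are made up for this sketch. Each step logs the body it receives and replaces the body with its own name, mirroring the test processors in the logs. The "faulty" case assumes, as the logs suggest, that the vararg form ends up registering A, B, C and D as four separate multicast branches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Plain-Java model of the two behaviours seen in the logs; not Camel code.
final class MulticastModel {
    final List<String> log = new ArrayList<>();

    // A step logs what it received and sets the body to its own name.
    UnaryOperator<String> step(String name) {
        return body -> {
            log.add(name + " got in=" + body);
            return name;
        };
    }

    // Pipeline: each step receives the output of the previous step.
    static String pipeline(String body, List<UnaryOperator<String>> steps) {
        String current = body;
        for (UnaryOperator<String> s : steps) {
            current = s.apply(current);
        }
        return current;
    }

    // Multicast: every branch receives the same original body.
    static List<String> multicast(String body, List<UnaryOperator<String>> branches) {
        List<String> results = new ArrayList<>();
        for (UnaryOperator<String> b : branches) {
            results.add(b.apply(body));
        }
        return results;
    }

    // Two branches, each of which is a pipeline (the explicit, working form).
    static List<String> runWorking() {
        MulticastModel m = new MulticastModel();
        UnaryOperator<String> ab = body -> pipeline(body, Arrays.asList(m.step("A"), m.step("B")));
        UnaryOperator<String> cd = body -> pipeline(body, Arrays.asList(m.step("C"), m.step("D")));
        multicast("START", Arrays.asList(ab, cd));
        return m.log;
    }

    // Four flat branches (what the vararg form appears to produce).
    static List<String> runFaulty() {
        MulticastModel m = new MulticastModel();
        multicast("START", Arrays.asList(m.step("A"), m.step("B"), m.step("C"), m.step("D")));
        return m.log;
    }

    public static void main(String[] args) {
        System.out.println(runWorking());
        System.out.println(runFaulty());
    }
}
```

In the first run B sees the output of A and D sees the output of C; in the second run all four steps see START, which is the behaviour reported for the vararg forms.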



Re: Question on multicast to pipelines

2016-10-28 Thread Edoardo Causarano
Hi Brad,

yes that's the definition of multicast but the documentation also suggests
that pipelines are supposed to be implicitly derived from a vararg to(...)
statement.

In any case I don't understand how the multicast could leak into a vararg
pipeline(...). It really feels like a bug to me, but I know little enough
of Camel to assume there's probably a good reason (in which case I'm happy
to understand its logic).


Re: Question on multicast to pipelines

2016-10-28 Thread Edoardo Causarano
Hi Darius, that's correct, I'm expecting the START message to enter the
pipelines from their heads.

On Fri, 28 Oct 2016 at 20:43, DariusX  wrote:

> It isn't clear what you want as the expected output.
> From your example, it seems that you have two "pipelines" and you want to
> multicast to send the message down both pipelines.
> But, that's probably not what you really want.
>
>
>
> --
> View this message in context:
> http://camel.465427.n5.nabble.com/Question-on-multicast-to-pipelines-tp5789396p5789416.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
>


Question on multicast to pipelines

2016-10-28 Thread Edoardo Causarano
Hi all,

I had some trouble figuring out how to multicast to some pipelines. I 
eventually found a [working] definition, but I also expected other forms to 
work such as:

.multicast().aggregationStrategy(AggregationStrategies.groupedExchange())
.pipeline("A", "B")
.pipeline("C", "D")
.end()

or 

.multicast().aggregationStrategy(AggregationStrategies.groupedExchange())
.to("A", "B")
.to("C", "D")
.end()

yet they all fail (in the sense that every endpoint receives the original 
START incoming payload). Can anyone explain whether this is expected behavior? 


Best,
Edoardo


[working]: https://gist.github.com/ecausarano/4b66294464741b9f626890b29ea0aec2





Re: Losing mi sanity on pipeline testing

2016-10-28 Thread Edoardo Causarano
Hi Quinn,

Thanks that's correct, but really I was looking for a way to test the
construct timer+broadcast+pipelines+merged_processor.

Shoving the datasets to fake the remotes inside the pipelines made sense.

Anyway, I'll see if I have time to get back to this setup sometime later.
Right now I'm working along with a different kludge :/


Edoardo


Re: Losing mi sanity on pipeline testing

2016-10-27 Thread Edoardo Causarano
Hi,

answers inline:

> On 27 Oct 2016, at 14:50, Brad Johnson  wrote:
> 
> Try putting in .log("${body}") in between each of the lines and see what you
> get. It appears you are using the SimpleDataSet which takes a single
> String.  What behavior are you expecting from it?  Is it throwing an
> exception?  Have you tried sending in a List with a ListDataSet to
> see what behavior it gives you?

I’m expecting the SimpleDataSet to provide 1 message body consisting of a 
List (a list of items from a DropBox query.)
It’s throwing when trying to assert that the body corresponds to the data 
provided by the direct producer template (I only figured that out after step 
debugging; is this the intended behavior?) 

> Is there a specific reason you are using a .pipeline() call explicitly?  I
> believe that's the default behavior.  The only reason I can think of
> explicitly using it is when you just want to use it as a short hand for
> routing as shown in the Camel documentation. You don't have to chain a
> bunch of to() method calls then.

Because I’d like to broadcast a timer trigger to a couple pipelines consisting 
of remote API getters and transformers. Otherwise all stages - including the 
transformers - would be directly wired to the timer and that would make no 
sense. 

timer -+-> pipeline(dropbox, transformer) -+-> doStuff
       |-> pipeline(google, transformer)  -|


Best,
Edoardo
 
> from("direct:a").pipeline("direct:x", "direct:y", "direct:z",
> "mock:result");
> 
> On Thu, Oct 27, 2016 at 7:03 AM, Edoardo Causarano <
> edoardo.causar...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> I’m trying to do something relatively simple, yet it’s been eluding me for
>> hours already!
>> 
>> I’d like to have a timer trigger a broadcast to a couple pipelines to
>> invoke a remote cloud component, transform the responses to a common type
>> and merge the results in one homogenous list to further process.
>> 
>> It’s been a nightmare so far and even testing is proving to be nigh
>> impossible, even this simple setup is failing with a :
>> 
>> @Override
>> protected RouteBuilder createRouteBuilder() {
>>     return new RouteBuilder() {
>>         @Override
>>         public void configure() throws Exception {
>>             from("direct:start").pipeline()
>>                     .to("dataset:source")
>>                     .to("dropBoxTranslator")
>>                 .end()
>>                 .to("mock:result");
>>         }
>>     };
>> }
>> 
>> Seems that the dataset endpoint tries to assert that its own output equals
>> the dummy string I sent from the direct:start producer to execute an
>> exchange (in 
>> org.apache.camel.component.dataset.DataSetSupport#assertMessageExpected).
>> What on earth is going on?!
>> 
>> 
>> Best,
>> Edoardo
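
The timer fan-out sketched in the diagram above can be modelled without Camel to check the intended data flow: one trigger, two branches that each fetch and then transform to a common type, and a single merged list at the end. This is a sketch only; FanOutMerge, branch and fireAndMerge are hypothetical names, and the in-memory lists stand in for the Dropbox and Google calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Plain-Java model of "trigger -> N pipelines -> merged list"; not Camel code.
final class FanOutMerge {
    // One branch: fetch raw items, then convert each to the common type T.
    static <R, T> Supplier<List<T>> branch(Supplier<List<R>> fetch, Function<R, T> transform) {
        return () -> {
            List<T> out = new ArrayList<>();
            for (R raw : fetch.get()) {
                out.add(transform.apply(raw));
            }
            return out;
        };
    }

    // Fire every branch once and concatenate the results in branch order.
    static <T> List<T> fireAndMerge(List<Supplier<List<T>>> branches) {
        List<T> merged = new ArrayList<>();
        for (Supplier<List<T>> b : branches) {
            merged.addAll(b.get());
        }
        return merged;
    }

    // Stand-in data: file names from one source, numeric ids from the other.
    static List<String> demo() {
        Supplier<List<String>> dropboxItems = () -> Arrays.asList("a.txt", "b.txt");
        Supplier<List<Integer>> googleItems = () -> Arrays.asList(1, 2);
        Supplier<List<String>> dropbox = branch(dropboxItems, name -> "dropbox:" + name);
        Supplier<List<String>> google = branch(googleItems, id -> "google:" + id);
        return fireAndMerge(Arrays.asList(dropbox, google));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the model is that the transformers sit inside each branch, so only the branch heads are wired to the trigger, which is what the explicit pipeline() calls are meant to achieve.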



Losing mi sanity on pipeline testing

2016-10-27 Thread Edoardo Causarano
Hi all,

I’m trying to do something relatively simple, yet it’s been eluding me for 
hours already!

I’d like to have a timer trigger a broadcast to a couple pipelines to invoke a 
remote cloud component, transform the responses to a common type and merge the 
results in one homogenous list to further process.

It’s been a nightmare so far and even testing is proving to be nigh impossible, 
even this simple setup is failing with a :

@Override
protected RouteBuilder createRouteBuilder() {
    return new RouteBuilder() {
        @Override
        public void configure() throws Exception {
            from("direct:start").pipeline()
                    .to("dataset:source")
                    .to("dropBoxTranslator")
                .end()
                .to("mock:result");
        }
    };
}

Seems that the dataset endpoint tries to assert that its own output equals the 
dummy string I sent from the direct:start producer to execute an exchange (in 
org.apache.camel.component.dataset.DataSetSupport#assertMessageExpected). What 
on earth is going on?!


Best,
Edoardo
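
The assertion described above is consistent with a dataset-style endpoint having two roles, as far as I understand the component: when consumed from, it emits its prepared messages, and when sent to, it checks each incoming message against them. The sketch below is a simplified stand-in written for this note (DataSetModel is a made-up class, not the Camel implementation) that shows why sending a dummy body to such an endpoint fails.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for a dataset-style endpoint; Camel classes are not used.
final class DataSetModel {
    private final List<String> expected;
    private int received = 0;

    DataSetModel(List<String> expected) {
        this.expected = expected;
    }

    // Consumer role, as in from("dataset:..."): hand out the prepared bodies.
    List<String> emitAll() {
        return expected;
    }

    // Producer role, as in .to("dataset:..."): verify the incoming body.
    void receive(String body) {
        if (received >= expected.size()) {
            throw new AssertionError("unexpected extra message: " + body);
        }
        String want = expected.get(received);
        if (!want.equals(body)) {
            throw new AssertionError(
                "message " + received + ": expected " + want + " but got " + body);
        }
        received++;
    }

    public static void main(String[] args) {
        DataSetModel source = new DataSetModel(Arrays.asList("item-1"));
        try {
            // Mirrors the route above: a dummy trigger body is sent to the endpoint.
            source.receive("dummy");
        } catch (AssertionError e) {
            System.out.println("assertion failed: " + e.getMessage());
        }
    }
}
```

Under that reading, a route that wants the data set as input would consume from the endpoint rather than send the trigger message to it.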