[GitHub] storm pull request: [STORM-855] Add tuple batching

2016-03-31 Thread mjsax
Github user mjsax closed the pull request at: https://github.com/apache/storm/pull/694 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enable

[GitHub] storm pull request: [STORM-855] Add tuple batching

2016-03-31 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-203830224 Closing this (also closing the JIRA) as Bobby's work (https://github.com/apache/storm/pull/765) got merged and I don't have time to work on this right now. --- If your p

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-10-05 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-145539778 Thx. Take your time; there is actually no rush. I was just curious :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-10-05 Thread revans2
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-145534243 I still need to run some tests. I am way behind on my open source commitments. I will try really hard this week to play around with this and let you know the results.

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-10-05 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-145473220 Hi, what is the next step? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have thi

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-09-29 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-144211924 One question: There are a lot of changes in `storm-core/src/jvm/backtype/storm/generated/*` resulting from rebuild those files with `genthrift.sh`. However, it seems to me

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-09-29 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-144203098 I just pushed a new version (rebased to current master and squahed commits etc). The PR is in much cleaner state now. Looking forward to your feedback. Batching doe

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-09-29 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-144071648 I see. It's a github issue... Usually I rebase before updating a PR. This time I did not... Thanks for the quick response. --- If your project is set up for it, you can re

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-09-29 Thread revans2
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-144066244 @mjsax the issue is with how you upmerge. If you just do a git merge or a git pull github can become confused because it thinks you are still based off of the original c

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-09-29 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-144060581 I just realized, that some commits from other people got added to this PR. This confuses me. Can you guide me through the process Storm development is following here? I am

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-09-24 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-142980140 Ok. Hope to get it done over the weekend... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-09-24 Thread harshach
Github user harshach commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-142977482 @mjsax Thanks for the quick reply. It would be great to ack issue fixed than others can also run some tests. --- If your project is set up for it, you can reply to thi

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-09-24 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-142977052 I did not have time yet. However, I am still on it... I did already some more work for this PR that is not pushed yet (for example to support different output batches for d

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-09-24 Thread harshach
Github user harshach commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-142975644 @mjsax did you get a chance to look at the ack tuples issue. This is going to be great perf improvement would like to see this merged in. --- If your project is set up

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-27 Thread revans2
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-135608325 acking does seem to be having some issues. The ack tuples don't seem to be batched, so the serialization code gets confused. --- If your project is set up for it, you c

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-26 Thread knusbaum
Github user knusbaum commented on a diff in the pull request: https://github.com/apache/storm/pull/694#discussion_r38002218 --- Diff: storm-core/src/clj/backtype/storm/daemon/executor.clj --- @@ -18,23 +18,23 @@ (:import [backtype.storm.generated Grouping] [ja

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-25 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-134682336 Here are some additional benchmark results with larger `--messageSize` (ie, 100 and 250). Those benchmarks are run in a 12 node cluster (with `nimbus.thrift.max_buffer_size

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-25 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-134656876 I just checked some older benchmark result doing batching in user land, ie, on top of Storm (=> Aeolus). For this case, a batch size of 100 increased the spout output rate

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread knusbaum
Github user knusbaum commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-134468397 Excellent work. This looks very promising. I'll address your questions tomorrow. --- If your project is set up for it, you can reply to this email and have your reply a

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-134450197 I just pushed some changes (I added a new commit, so you can better see what I changed): - added type hints - split tuple and batch serialization in separate class

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread knusbaum
Github user knusbaum commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-134412604 @mjsax, I'm not sure it's not related. Here is the output from my benchmarks: Pre-batching (apache master) ``` status timetime-diff ms

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread mjsax
Github user mjsax commented on a diff in the pull request: https://github.com/apache/storm/pull/694#discussion_r37807820 --- Diff: storm-core/src/clj/backtype/storm/daemon/executor.clj --- @@ -658,40 +688,42 @@ ;;(log-de

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread knusbaum
Github user knusbaum commented on a diff in the pull request: https://github.com/apache/storm/pull/694#discussion_r37807094 --- Diff: storm-core/src/clj/backtype/storm/daemon/executor.clj --- @@ -658,40 +688,42 @@ ;;(log

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread knusbaum
Github user knusbaum commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-134373051 Keep an eye out for more missing types. Those are often a source of unexpectedly bad performance. I'm hesitant to add another thread, especially since we're operatin

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-134371947 Type hinting improves by 10% :) But 30% is still a huge gap. About batches that don't fill up. We need to introduce a timeout (and a flushing thread or other flushing st

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread mjsax
Github user mjsax commented on a diff in the pull request: https://github.com/apache/storm/pull/694#discussion_r37800211 --- Diff: storm-core/src/clj/backtype/storm/daemon/executor.clj --- @@ -445,6 +445,32 @@ ret )) +(defn init-batching-buffer [worker-co

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread knusbaum
Github user knusbaum commented on a diff in the pull request: https://github.com/apache/storm/pull/694#discussion_r37785236 --- Diff: storm-core/src/clj/backtype/storm/daemon/executor.clj --- @@ -445,6 +445,32 @@ ret )) +(defn init-batching-buffer [worker

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread knusbaum
Github user knusbaum commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-134326789 What happens when Batch.capacity - 1 tuples are emitted and then there's a long pause in the input? --- If your project is set up for it, you can reply to this email an

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread knusbaum
Github user knusbaum commented on a diff in the pull request: https://github.com/apache/storm/pull/694#discussion_r37780767 --- Diff: storm-core/src/clj/backtype/storm/daemon/executor.clj --- @@ -445,6 +445,32 @@ ret )) +(defn init-batching-buffer [worker

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread knusbaum
Github user knusbaum commented on a diff in the pull request: https://github.com/apache/storm/pull/694#discussion_r37780743 --- Diff: storm-core/src/clj/backtype/storm/daemon/executor.clj --- @@ -445,6 +445,32 @@ ret )) +(defn init-batching-buffer [worker

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-24 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-134315867 Hi, I did a few performance tests and unfortunately, the impact of my changes is quite high. Not using batching in my branch (compared to master) reduces throughput by 40%.

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-133200328 Sure, please take your time! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-20 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-133198490 Thanks for your guidance! I will have a look at #521 and add some results. May take a few days... --- If your project is set up for it, you can reply to this email and hav

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-133197501 No, there's no specific approach. You can refer @d2r's approach, https://github.com/apache/storm/pull/521#issuecomment-106038982 I think we would be inte

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-20 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-133194835 Sure. Is there any specific approach I should take? How to add/report those result? I did a simple test already, using `ExclamationTopology` example. Setting batch size to

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-133194709 FYI, you can use https://github.com/yahoo/storm-perf-test for benchmarking if you don't have your own. You need to modify pom.xml to let it points recent version o

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-133193429 @mjsax I'm not familiar with clojure, too. Many committers will take a look, so don't worry. :) Before taking a look, I think you're encouraged to do ben

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-20 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-133187814 Great! Just a heads up: This is my first PR for Storm and I just started to learn Clojure. Please review very carefully. Looking forward to your feedback. --- If your

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-133187367 asfbot recognizes your change and links. :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your pr

[GitHub] storm pull request: [STORM-855] Add tuple batching

2015-08-20 Thread mjsax
Github user mjsax commented on the pull request: https://github.com/apache/storm/pull/694#issuecomment-133186983 I guess I need to close an re-open this PR to get the linkage to JIRA... --- If your project is set up for it, you can reply to this email and have your reply appear on Git