On Fri, Oct 4, 2013 at 8:57 PM, Paul Snively <[email protected]> wrote:
> Hi Prashant! > > That's very interesting. So this appears to be a general issue with >> ActorStream, then? >> >> With all DStream's I suppose. > > > OK, so that seems like a bug independent of the issue around having no > ActorSystem in scope for runJob(), then. > > So, it is not a bug as Matei already pointed out. > I have not spent enough time on spray-io yet, but if it is implemented as > ActorExtension, then using actorStream is the best fit. For example I have > used akka-camel/akka-zeromq to receive streams from variety of endpoints. I > am not suggesting you should use it, although it has SSL support etc. That > was the original motivation while designing it. But if it can be somehow > used in a different and more useful way, I had encourage it. > > > I'm reluctant to add another abstraction layer here, since the motivation > for using spray-io in the first place is its extreme scalability, and we > mostly want to rely on Spark Streaming here for managing availability and > recovery. To me, it seems like we've identified two obvious issues: one > revolving around foreach ... foreach ... over a DStream not working > correctly, the other revolving around runJob() not having an ActorSystem in > scope. While I understand that there is much other work to do, both of > these seem, first of all, rather serious, and secondly, as if they should > have relatively easy fixes. Of course, I'm open to being corrected in this > belief. :-) > > I had like to understand a bit further, as to what are your ultimate > intention in a bit more detail on architecture from a higher level. > Probably I am hinting on a redesign. > > > That's certainly a valid option! > > Without discussing things I'm not at liberty to, all I can really say is > that I'm trying to build a real-time streaming server that takes a > connection from a client that expects to be able to then send arbitrary > bytes to the server. On the server, I want a DStream per socket connection > to process the incoming data. There's other stuff, such as this needing to > support TLS mutual authentication, but I'll get there when I have data > flowing. :-) > > It's really quite simple, or at least it should be, and again, I believe > it will be, when the bugs we've identified are fixed. > > In your case, the code is still complicated to spot where exactly we are > leaking an ActorRef in the test case ? > > > If it helps, the only ActorRef in my code now is in the SpraySslClient, > i.e. nowhere that's being serialized. > > or in our Handler. > > > I have to respectfully suggest it is indeed this. OTOH, is it really a > "leak?" I am, in fact, closing over objects that, sooner or later, > construct ActorRefs. The whole point of my task is to do something with > data coming from an ActorRef that represents a TCP connection, then send > the result back to that same ActorRef. I guess I'm having a little trouble > understanding how many actorStream use cases there are that wouldn't > involve ultimately closing over at least one ActorRef. :-) > > It is also possible to start multiple actorStream receiver in advance and > then using their supervisor to start more actor as receiver for you, for > example on each connection (if that makes any sense, take a look at it). > > > I'm afraid I didn't quite follow that, but it sounds interesting! > > At this point, though, I have to suggest that the path forward, simply for > Spark Streaming to work correctly out of the box, is clear. I have created > an issue against the missing ActorSystem in runJob() in JIRA, but it sounds > like we need another one for the foreach ... foreach ... issue. I'm afraid > I don't know exactly how to characterize that issue. Could I possibly > impose upon you to create that ticket? > > BTW !, Did you consider doing above ? It can make some sense from the way spark streaming is designed. IMO Spawning a new dstream on each connection doesn't seem good architectural choice, rather a new actor is. I might be wrong. Thanks again for digging into tis! > Paul > > -- s
