We have a similar setup. Based on our routing needs, we plan to pass the original JSON plus some extra fields that simplify routing via fieldsGrouping, e.g.:

spout -> emits content (JSONObject)
bolt1 -> receives all spout tuples; execute() uses the JSONObject -> emit(JSONObject, String, String) (the String values are parsed out of the JSONObject)
bolt2 -> receives bolt1 tuples based on fieldsGrouping; execute() uses the JSONObject, String, String to perform some operation -> emit(JSONObject, String) (the String value is based on some logic)
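The tuple layout described above can be sketched in plain Java, with no Storm dependency. This is only an illustration under stated assumptions: RoutedTuple, toTuple, and taskFor are hypothetical names, the "userId" and "type" keys are made up, and a Map stands in for the parsed JSONObject. taskFor only approximates the idea behind a fields grouping (equal field values always land on the same task); it is not Storm's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

class JsonRoutingSketch {

    /** Stand-in for a tuple: the original document plus two routing fields copied out of it. */
    static final class RoutedTuple {
        final Map<String, Object> content; // full original document, kept for downstream bolts
        final String userId;               // hypothetical routing field
        final String type;                 // hypothetical routing field

        RoutedTuple(Map<String, Object> content, String userId, String type) {
            this.content = content;
            this.userId = userId;
            this.type = type;
        }
    }

    /** Roughly what bolt1 does: keep the whole document and pull out the values used for routing. */
    static RoutedTuple toTuple(Map<String, Object> content) {
        return new RoutedTuple(
                content,
                String.valueOf(content.get("userId")),
                String.valueOf(content.get("type")));
    }

    /** Approximation of a fields grouping: the same field value always maps to the same task index. */
    static int taskFor(String fieldValue, int numTasks) {
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("userId", "u42");
        doc.put("type", "click");
        doc.put("payload", "anything else in the original JSON");

        RoutedTuple tuple = toTuple(doc);
        // Every tuple with userId "u42" would be sent to this same task index.
        System.out.println("task index for " + tuple.userId + ": " + taskFor(tuple.userId, 4));
    }
}
```

The point of carrying both is that the grouping only needs the small extracted fields, while the bolts that do the real work still have the full document and do not need to know in advance which parts of it they will want.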
So while you could go to either extreme (always pass the JSONObject as the tuple value, or parse the JSONObject and pass only the decomposed values), you can also do both, which lets you use fieldsGrouping if that matters to you (it does in our case).

Tyson

On Mar 29, 2014, at 1:44 PM, Software Dev <[email protected]> wrote:

> We actually have a spout that emits just 1 JSON string per tuple.
> Wondering what should be done downstream after we have the JSON string
>
> On Sat, Mar 29, 2014 at 12:20 PM, Andrew Neilson <[email protected]> wrote:
>> my team's project has successfully used Jackson
>> (https://github.com/FasterXML/jackson) to deserialize a spout of JSON arrays
>> into tuples, and I can recommend it. Though I'll warn you that it takes a
>> little bit of work beyond the most basic usage (i.e. mapper.readValue(json,
>> List.class)) to avoid dealing with type ambiguity.
>>
>>
>> On Sat, Mar 29, 2014 at 12:06 PM, Software Dev <[email protected]>
>> wrote:
>>>
>>> Say we are receiving tuples of JSON from a spout. Should we just keep
>>> passing around the JSON string and deserialize it in each bolt, or is
>>> it best to break apart the JSON object into a bunch of fields that can
>>> be passed around?
>>>
>>> I'm thinking in terms of performance the latter may be "better",
>>> although it will make the rest of the topology slightly more complex.
>>>
>>> Also, what is a good JSON library to work with?
>>>
>>> Thanks
>>
>>
