While that approach may result in mention objects with ids/names that were 
originally paired, we can't guarantee that without making an API lookup to 
Twitter.

In general I’m in favor of streams maintaining and attempting to improve data 
accuracy. In this scenario, though, the inbound document has been degraded in a 
way that goes beyond the current scope and authority (API-wise) of the module 
to resolve. Given that, I’m wary of setting the extension fields in a way that 
could recombine fields incorrectly and thus make the problem worse.

So my vote would be to create a separate object for each id and each name in 
every case, maintaining all of the original information, and to leave it to a 
downstream processor to improve the metadata if there is value in doing so.
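
As a rough sketch of what I mean (Python for brevity; the function name 
build_mentions and the dict shape with "name" and "id" keys are hypothetical 
and not taken from the actual serializer):

```python
from typing import Dict, List, Optional

# Hypothetical shape: a mention carries either a name or an id, never both.
Mention = Dict[str, Optional[str]]


def build_mentions(names: List[str], ids: List[str]) -> List[Mention]:
    """Emit one mention per handle and one per id, without pairing them."""
    # Handles and ids are kept apart even when the two lists are the same
    # length, since equal length does not prove the entries line up.
    by_name: List[Mention] = [{"name": n, "id": None} for n in names]
    by_id: List[Mention] = [{"name": None, "id": i} for i in ids]
    return by_name + by_id
```

Two handles and one id would then yield three mention objects, and nothing 
from the inbound document is dropped or recombined.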

Steve Blackmon
[email protected]



On Jun 30, 2014, at 9:50 AM, Robert Douglas <[email protected]> 
wrote:

> Hi all,
> 
> I’m currently working on cleaning up the implementation of the DataSift
> serializer and have come upon an issue. The data that we get back in a
> DataSift Interaction object contains two fields, mentions (which has all
> the handles for mentioned users) and mention_ids (which has all the Ids for
> mentioned users). Problem is, there is no guarantee that these two lists
> will be the same size. My current solution is to merge together the handles
> and Ids into individual UserMention objects whenever the mentions and
> mention_ids lists are the same size. In the event that those lists are not
> the same size, I create UserMention objects for every entry in both lists.
> 
> Does anyone have a different opinion on how this should be handled?
> 
> — Robert
