You would need to convert the UNIX timestamp to YYYY-MM-DD HH:MM:SS format, but I don't know the best way to do that. What I can say is that you need to add those columns to the data frame before training the model or scoring, which means your new data frame should be created before line 44 of FlowSuspiciousConnectsAnalysis.scala. That way, inputFlowRecords is transformed into a new data frame with all the date/time columns, and you pass that new data frame to line 44 instead of inputFlowRecords.
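One possible way to do the split, as a rough sketch only: the helper below uses plain java.time and assumes the timestamp is in whole seconds since the epoch and should be read as UTC. The names DateParts and TimestampSplit are made up for illustration and are not part of Spot; the field names simply mirror the columns mentioned in this thread. A function like this could probably be wrapped in a Spark UDF, or replaced with the built-in column functions in org.apache.spark.sql.functions (from_unixtime, year, month, dayofmonth, hour, minute, second), but I have not tried either against the Spot code.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

// Hypothetical container; field names mirror the Spot date/time columns.
final case class DateParts(
  tryear: Int, trmonth: Int, trday: Int,
  trhour: Int, trminute: Int, trsec: Int)

object TimestampSplit {
  // Pattern for the YYYY-MM-DD HH:MM:SS form mentioned above.
  private val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Assumption: epochSeconds is whole seconds since 1970-01-01, interpreted as UTC.
  def split(epochSeconds: Long): DateParts = {
    val t = Instant.ofEpochSecond(epochSeconds).atZone(ZoneOffset.UTC)
    DateParts(t.getYear, t.getMonthValue, t.getDayOfMonth,
      t.getHour, t.getMinute, t.getSecond)
  }

  // Same instant rendered as a single date/time string, again in UTC.
  def format(epochSeconds: Long): String =
    formatter.format(Instant.ofEpochSecond(epochSeconds).atZone(ZoneOffset.UTC))
}
```

If your timestamps are in milliseconds or in a local time zone, the two assumptions above would need to change accordingly.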
Let me know if that works.

On 3/13/17, 12:13 PM, "Giacomo Bernardi" <[email protected]> wrote:

Thanks so much Ricardo.

I'm doing the remapping as you suggested. However, for a handful of fields things are going to be more complicated:

1) I see a "treceived" field, what does it represent? Is it a timestamp?
2) Spot uses multiple fields for the date (tryear, trmonth, trday, trhour, trminute, trsec), while in my data I have a UNIX timestamp. Can you suggest what's the best way to split it up? I walked through the input data handler code but I couldn't figure out a semi-trivial way to do it.

Again thanks.
Giacomo
