Just looking for thoughts on this. I'm consuming the gardenhose via a PHP app on my web server, and so far so good. The script simply creates a new file every X amount of time and feeds the stream into it, so I get a continuous supply of fresh data and can delete old data via cron. I plan to access the stream (the files) with separate processes for further JSON parsing and data mining.
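Roughly what the writer script does, as a minimal sketch. The endpoint URL, rotation interval, directory, and file naming here are all placeholders, and real stream auth is left out:

```php
<?php
// Rotating-file stream writer (simplified sketch).
const ROTATE_SECONDS = 300;                    // the "X amount of time"
const DATA_DIR       = '/var/spool/gardenhose'; // placeholder directory

$streamUrl = 'https://stream.example.com/gardenhose'; // placeholder endpoint

$in = fopen($streamUrl, 'r');
if ($in === false) {
    die("could not open stream\n");
}

$out       = null;
$rotatedAt = 0;

while (($line = fgets($in)) !== false) {
    // Start a fresh file whenever the rotation window has elapsed;
    // cron deletes the old ones later.
    if ($out === null || time() - $rotatedAt >= ROTATE_SECONDS) {
        if ($out !== null) {
            fclose($out);
        }
        $out       = fopen(DATA_DIR . '/stream-' . time() . '.json', 'a');
        $rotatedAt = time();
    }
    fputs($out, $line); // one JSON object per line, same as the raw stream
}
```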
But then that got me thinking about simply feeding the data into a MySQL database for easier data manipulation and indexing. Would that put more load on the server, with the constant INSERT queries, than a process that just dumps the data into a perpetually open file [via PHP's fputs()]? What about running the PHP process and accessing the "stream" directly, only grabbing a snapshot of the data when a process needs it? I'm not really concerned with historical data, since my web-based app is focused on trends at a given moment. Just wondering out loud whether letting the process run in the background grabbing data would eventually fill up any caches or system memory.
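For comparison, this is the kind of batched-INSERT approach I'd picture for the MySQL route, which should keep the per-row overhead well below one round trip per INSERT. The table name, column, credentials, and batch size are all made up for the sketch:

```php
<?php
// Buffer incoming lines and insert them in batches inside one
// transaction, instead of firing an INSERT per tweet.
const BATCH_SIZE = 100;

$pdo = new PDO('mysql:host=localhost;dbname=stream', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);
$stmt = $pdo->prepare('INSERT INTO raw_tweets (payload) VALUES (?)');

$buffer = [];

function flushBuffer(PDO $pdo, PDOStatement $stmt, array &$buffer): void
{
    if ($buffer === []) {
        return;
    }
    // One transaction per batch amortizes the commit cost.
    $pdo->beginTransaction();
    foreach ($buffer as $json) {
        $stmt->execute([$json]);
    }
    $pdo->commit();
    $buffer = [];
}

$in = fopen('php://stdin', 'r'); // e.g. pipe the stream reader into this script
while (($line = fgets($in)) !== false) {
    $buffer[] = $line;
    if (count($buffer) >= BATCH_SIZE) {
        flushBuffer($pdo, $stmt, $buffer);
    }
}
flushBuffer($pdo, $stmt, $buffer); // write out whatever is left
```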