Does anyone have suggestions for open datasets in the 500 MB to 1 GB range to run and test SGD on?
I'm just looking for good benchmarking datasets and wondered what the community thinks.

Thanks,
Josh

--
Twitter: @jpatanooga
Principal Solution Architect @ Cloudera
hadoop: http://www.cloudera.com
