I don't think just any dataset will be useful. The test data should be close to the real data you want to process, and the processing you run on it should likewise match what you actually plan to do.
> On 28. Sep 2017, at 18:04, Gaurav1809 <[email protected]> wrote:
>
> Hi All,
>
> I have setup multi node spark cluster and now looking for good volume of
> data to test and see how it works while processing the same.
> Can anyone provide pointers as to where can i get few GBs of free sample
> data?
>
> Thanks and regards,
> Gaurav
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: [email protected]
