> > It's a gap in experience, Thomas.
Most probably you should read some good books on data extraction and then
choose your tools accordingly. I don't think that BSP is, or ever will be, a
good extraction technique for unstructured data.

But these are just my two cents here; there seem to be more political
problems in this game than problems with using tools appropriately.

2012/12/10 Thomas Jungblut <[email protected]>

> Yes, if you preprocess your data correctly.
> I have done the same unstructured extraction with the movie database from
> IMDB and it worked fine.
> That's just not a job for BSP, but for MapReduce.
>
> 2012/12/10 Edward J. Yoon <[email protected]>
>
>> It's a gap in experience, Thomas. Do you think you can extract Twitter
>> mention graph using parseVertex?
>>
>> On Tue, Dec 11, 2012 at 4:34 AM, Thomas Jungblut
>> <[email protected]> wrote:
>> > I have trouble understanding you here.
>> >
>> > How can I generate large sample without coding?
>> >
>> >
>> > Do you mean random data generation or real-life data?
>> > Personally I think it is really convenient to transform unstructured data
>> > in a text file to vertices.
>> >
>> >
>> > 2012/12/10 Edward <[email protected]>
>> >
>> >> I mean, With or without input reader. How can I generate large sample
>> >> without coding?
>> >>
>> >> It's unnecessary feature. As I mentioned before, only good for simple and
>> >> small test.
>> >>
>> >> Sent from my iPhone
>> >>
>> >> On Dec 11, 2012, at 3:38 AM, Thomas Jungblut <[email protected]>
>> >> wrote:
>> >>
>> >> >> In my case, generating test data is very annoying.
>> >> >
>> >> > Really? What is so difficult to generate tab separated text data?;)
>> >> > I think we shouldn't do this, but there seems to be very little interest in
>> >> > the community so I will not block your work on it.
>> >> >
>> >> > Good luck ;)
>> >>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
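
For what it's worth, the "preprocess first, then load line by line" idea
discussed in this thread can be sketched roughly as below. This is only an
illustration under my own assumptions: the (author, text) input shape, the
naive @mention pattern, and the helper names mention_adjacency and
to_tsv_lines are all hypothetical, and nothing here is Hama's API. It shows a
separate extraction step turning raw tweets into tab-separated text with one
vertex per line, which is presumably the kind of input a line-oriented vertex
reader would then consume.

```python
import re
from collections import defaultdict

# Naive @mention pattern (an assumption; real handle rules are stricter,
# and this would also match the domain part of an e-mail address).
MENTION = re.compile(r"@(\w+)")


def mention_adjacency(tweets):
    """Map each author to the sorted list of users they mention.

    `tweets` is an iterable of (author, text) pairs. This input shape is
    assumed for the sketch; it is not a format taken from the thread.
    Authors who mention nobody do not appear in the result.
    """
    graph = defaultdict(set)
    for author, text in tweets:
        for user in MENTION.findall(text):
            graph[author.lower()].add(user.lower())
    return {author: sorted(users) for author, users in graph.items()}


def to_tsv_lines(graph):
    """Render one tab-separated line per vertex: id, then its out-neighbours."""
    return ["\t".join([vertex] + neighbours)
            for vertex, neighbours in sorted(graph.items())]


sample = [
    ("alice", "thanks @Bob and @carol for the review"),
    ("bob", "@alice no problem"),
    ("alice", "ping @bob"),
]
lines = to_tsv_lines(mention_adjacency(sample))
for line in lines:
    print(line)
```

At real scale the extraction step would be a distributed job (MapReduce, as
suggested above) rather than a single in-memory dictionary; the sketch only
shows the shape of the data going in and coming out.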
