I came across this a few weeks ago. In a nutshell, you can use it to
generate test data, and in other scenarios where you need
realistic-looking but not necessarily real data. With so many
regulations, copyrights, etc., it is a viable alternative to real data.
I used it to generate 1,000 lines of mixed genuine and fraudulent
transactions to build a machine learning model that detects fraudulent
transactions using PySpark's MLlib library. You can install it via
pip install Faker

Details from

https://github.com/joke2k/faker

HTH

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom




 https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand
expert opinions" (Wernher von Braun).

