thisisnic commented on issue #57: URL: https://github.com/apache/arrow-cookbook/issues/57#issuecomment-919170497
Another thing to think about here is dataset requirements. I'm currently using some really compact datasets which are created inline so the reader can see their exact contents, and are each around 3 lines long. These won't work for every recipe of course, but their advantage is that they don't require the reader to load in data, they let the reader copy and paste all of the code, and they are very easy to reason about.

Here are a few:

## Oscars

| actor | awards |
|----|----|
| "Katharine Hepburn" | 4 |
| "Meryl Streep" | 3 |
| "Jack Nicholson" | 3 |

## Shares

| company | price | date |
|---|---|---|
| "AMZN" | 3463.12 | 2021-09-02 |
| "GOOG" | 2884.38 | 2021-09-02 |
| "BKNG" | 2300.46 | 2021-09-02 |
| "TSLA" | 732.39 | 2021-09-02 |

In creating datasets, I've tried to come up with topics that are familiar to most people, are vaguely interesting, and, where necessary, contain a few different data types.

An example of a dataset that I've used but would like to replace with something more compelling:

| group | score |
|---|---|
| "A" | 99 |
| "B" | 97 |
| "C" | 99 |

If expanded versions of the first 2 datasets would be of use to anyone else, let me know and I can try to create something. Alternatively, it'd be good to hear your requirements/ideas and see your existing datasets that could be useful to share.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
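As a rough illustration of what "created inline" could look like, here is a minimal sketch of the Oscars and Shares datasets from the comment written as column-oriented literals. It assumes Python, since the comment does not say which language the recipes use. The names `oscars`, `shares`, and `n_rows` are hypothetical and not taken from the cookbook, and the `pyarrow` conversion at the end is optional and skipped when the library is not installed.

```python
from datetime import date

# Column-oriented literals: each key is a column name, each list is a column.
# The values are copied from the Oscars and Shares tables in the comment.
oscars = {
    "actor": ["Katharine Hepburn", "Meryl Streep", "Jack Nicholson"],
    "awards": [4, 3, 3],
}

shares = {
    "company": ["AMZN", "GOOG", "BKNG", "TSLA"],
    "price": [3463.12, 2884.38, 2300.46, 732.39],
    "date": [date(2021, 9, 2)] * 4,
}


def n_rows(columns):
    """Return the row count, checking that all columns have equal length.

    Hypothetical helper for this sketch only.
    """
    lengths = {len(values) for values in columns.values()}
    if len(lengths) != 1:
        raise ValueError(f"ragged columns: {sorted(lengths)}")
    return lengths.pop()


# Optional: build Arrow tables from the same literals if pyarrow is available.
try:
    import pyarrow as pa

    oscars_table = pa.table(oscars)
    shares_table = pa.table(shares)
except ImportError:
    oscars_table = shares_table = None
```

Because the data lives entirely in the snippet, a reader can paste it as-is and see every value, which is the property the comment is after.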
