Has anyone tried fine-tuning (LoRA) on smaller LLMs for this task?

On Sunday, April 9, 2023 at 5:36:16 AM UTC+5:30 [email protected] wrote:
> This version <https://gist.github.com/reedlaw/5f84c801c20ad5f1e6b16c6ae193b4a7>
> uses an OpenAI API key and transforms the CSV file, adding a Category column.
>
> On Saturday, April 8, 2023 at 7:37:05 PM UTC-4 Reed Law wrote:
>
>> I got decent results with this script
>> <https://gist.github.com/reedlaw/c0adb9dbc98fc843748365c8c2154a61>. It
>> uses the LangChain Python library with LLaMA.cpp
>> <https://github.com/ggerganov/llama.cpp> and the Alpaca LLM model. The
>> results aren't as good as with the OpenAI API but it keeps everything local
>> and private.
>>
>> On Tuesday, April 4, 2023 at 6:45:17 PM UTC-4 Reed Law wrote:
>>
>>> Good reminder about privacy. There is an open source chatbot, Vicuna
>>> <https://github.com/lm-sys/FastChat>, which plans to release training
>>> weights next week meaning you can use it privately and offline.
>>>
>>> On Sat, Apr 1, 2023 at 10:57 PM Martin Blais <[email protected]> wrote:
>>>
>>>> FYI for those who might be tempted to feed their entire ledger into the
>>>> thing, keep in mind the privacy implications of doing this.
>>>> The queries are very likely logged and even if used indirectly as
>>>> training data for further updates, might leak your personal info to
>>>> someone else generating transaction data.
>>>> Just saying... it might not be obvious to some.
>>>>
>>>> On Sat, Apr 1, 2023 at 10:51 PM Red S <[email protected]> wrote:
>>>>
>>>>> Very neat! I've got to admit, I'm bowled over by the natural language
>>>>> interface working so well.
>>>>>
>>>>> Amazon purchase classification is almost the only area of
>>>>> classification where smart_importer doesn't do very well for me, since
>>>>> there are few repeat purchases or keywords.
>>>>>
>>>>> Good to know about aichat too.
>>>>>
>>>>> That said, here are the results I tried to replicate with the same
>>>>> data.
>>>>> Note the claim at the bottom: "You can use any account name you like
>>>>> as long as it doesn’t contain spaces," and its incongruency with
>>>>> literally every classification containing a space. One of the
>>>>> limitations of the current generation is a lack of a
>>>>> confidence-in-results indicator, which shows up here.
>>>>>
>>>>> 2022-08-19 * "Amazon.com" "Beginning Drawing Atelier: An Instructional Sketchbook"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -24.71 USD
>>>>>   Expenses:Art Supplies
>>>>>
>>>>> 2022-09-14 * "Amazon.com" "Amazon Basics Clear Sheet Protectors for 3 Ring Binder, 8.5 x 11 Inch, 200-Pack"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -18.16 USD
>>>>>   Expenses:Office Supplies
>>>>>
>>>>> 2022-10-05 * "Amazon.com" "Bosch Rear Wiper Blade H307 /3397011429 Original Equipment Replacement- 12\" (Pack of 1)"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -11.6 USD
>>>>>   Expenses:Auto Parts
>>>>>
>>>>> 2023-01-26 * "Amazon.com" "Duxtop 1600W Single Burner Electric Cooktop, Electric Hot Plate for Cooking, Electric Stove with Sensor Touch Control, Portable Infrared Burner with T"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -69.43 USD
>>>>>   Expenses:Kitchen Appliances
>>>>>
>>>>> 2023-01-26 * "Amazon.com" "PILOT Parallel Caligraphy Pens, Assorted Point Sizes and Colors, 6 Count (12722)"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -32.44 USD
>>>>>   Expenses:Art Supplies
>>>>>
>>>>> 2023-02-28 * "Amazon.com" "Laptop Stand, Laptop Holder Riser Computer Stand, Adjustable Aluminum Foldable Portable Notebook Stand, Compatible with MacBook Air Pro, HP, Lenovo, D"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -14.29 USD
>>>>>   Expenses:Laptop Accessories
>>>>>
>>>>> In this example, I’ve added a category to each transaction by adding a
>>>>> new line after the transaction line and indenting it by two spaces. The
>>>>> category is specified as an account name and is preceded by a colon.
>>>>> You can use any account name you like as long as it doesn’t contain spaces.
>>>>>
>>>>> On Saturday, April 1, 2023 at 7:51:20 AM UTC-7 [email protected] wrote:
>>>>>
>>>>>> I used a command-line tool, aichat
>>>>>> <https://github.com/sigoden/aichat>, to pipe transactions to GPT-4
>>>>>> and categorize expenses. I gave it the following input (generated
>>>>>> according to Booking Amazon Purchases
>>>>>> <https://reds-rants.netlify.app/personal-finance/booking-amazon-purchases/>):
>>>>>>
>>>>>> 2022-08-19 * "Amazon.com" "Beginning Drawing Atelier: An Instructional Sketchbook"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -24.71 USD
>>>>>>
>>>>>> 2022-09-14 * "Amazon.com" "Amazon Basics Clear Sheet Protectors for 3 Ring Binder, 8.5 x 11 Inch, 200-Pack"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -18.16 USD
>>>>>>
>>>>>> 2022-10-05 * "Amazon.com" "Bosch Rear Wiper Blade H307 /3397011429 Original Equipment Replacement- 12\" (Pack of 1)"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -11.6 USD
>>>>>>
>>>>>> 2023-01-26 * "Amazon.com" "Duxtop 1600W Single Burner Electric Cooktop, Electric Hot Plate for Cooking, Electric Stove with Sensor Touch Control, Portable Infrared Burner with T"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -69.43 USD
>>>>>>
>>>>>> 2023-01-26 * "Amazon.com" "PILOT Parallel Caligraphy Pens, Assorted Point Sizes and Colors, 6 Count (12722)"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -32.44 USD
>>>>>>
>>>>>> 2023-02-28 * "Amazon.com" "Laptop Stand, Laptop Holder Riser Computer Stand, Adjustable Aluminum Foldable Portable Notebook Stand, Compatible with MacBook Air Pro, HP, Lenovo, D"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -14.29 USD
>>>>>>
>>>>>> Using the following prompt and command:
>>>>>>
>>>>>> cat amz.beancount | aichat add categorized expense postings to
>>>>>> balance each of the following transactions, preserving
>>>>>> the included zero sum posting. do not use spaces in account names > out.beancount
>>>>>>
>>>>>> I got these results:
>>>>>>
>>>>>> 2022-08-19 * "Amazon.com" "Beginning Drawing Atelier: An Instructional Sketchbook"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -24.71 USD
>>>>>>   Expenses:Art:Instructional-Sketchbook
>>>>>>
>>>>>> 2022-09-14 * "Amazon.com" "Amazon Basics Clear Sheet Protectors for 3 Ring Binder, 8.5 x 11 Inch, 200-Pack"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -18.16 USD
>>>>>>   Expenses:Office-Supplies:Sheet-Protectors
>>>>>>
>>>>>> 2022-10-05 * "Amazon.com" "Bosch Rear Wiper Blade H307 /3397011429 Original Equipment Replacement- 12\" (Pack of 1)"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -11.6 USD
>>>>>>   Expenses:Car:Maintenance
>>>>>>
>>>>>> 2023-01-26 * "Amazon.com" "Duxtop 1600W Single Burner Electric Cooktop, Electric Hot Plate for Cooking, Electric Stove with Sensor Touch Control, Portable Infrared Burner with T"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -69.43 USD
>>>>>>   Expenses:Kitchen:Appliances
>>>>>>
>>>>>> 2023-01-26 * "Amazon.com" "PILOT Parallel Caligraphy Pens, Assorted Point Sizes and Colors, 6 Count (12722)"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -32.44 USD
>>>>>>   Expenses:Art:Calligraphy-Pens
>>>>>>
>>>>>> 2023-02-28 * "Amazon.com" "Laptop Stand, Laptop Holder Riser Computer Stand, Adjustable Aluminum Foldable Portable Notebook Stand, Compatible with MacBook Air Pro, HP, Lenovo, D"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases -14.29 USD
>>>>>>   Expenses:Office-Supplies:Laptop-Stand
>>>>>>
>>>>>> There was some trial and error that resulted in the final prompt. At
>>>>>> first, it removed the zero-sum postings and only output expenses. The
>>>>>> expense accounts had spaces in their names (e.g. "Expenses:Office
>>>>>> Supplies:Laptop Stand").
>>>>>> There is still room for improvement (I don't really care to have an
>>>>>> account for laptop stands). Overall, it looks very helpful. Perhaps a
>>>>>> good workflow is to use GPT to categorize a small batch of expenses,
>>>>>> manually correct them, and then use smart_importer
>>>>>> <https://github.com/beancount/smart_importer> to handle the rest.
>>>>>>
>>>>>> If anyone else has tips please share!
>>>>>>
>>>>>> On Saturday, April 1, 2023 at 1:53:46 AM UTC-4 Red S wrote:
>>>>>>
>>>>>>> Great idea. Talk about drudgery!
>>>>>>>
>>>>>>> On Friday, March 31, 2023 at 12:35:09 AM UTC-7 [email protected]
>>>>>>> wrote:
>>>>>>>
>>>>>>> I haven't tried yet but my hope is that GPT would be great at
>>>>>>> writing test code. Can you imagine?
>>>>>>>
>>>>>>> On Fri, Mar 31, 2023, 01:03 Red S <[email protected]> wrote:
>>>>>>>
>>>>>>> Fantastic! Hooray for reducing drudgery!
>>>>>>>
>>>>>>> It's apparently quite effective at using code from existing GitHub
>>>>>>> projects. I love the code explanations in natural language.
>>>>>>>
>>>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Beancount" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/beancount/b08ca160-539a-4d7a-ba9c-8540f214d2bcn%40googlegroups.com
>>>>>
>>>> --
>>>> You received this message because you are subscribed to a topic in the
>>>> Google Groups "Beancount" group.
>>>> To unsubscribe from this topic, visit
>>>> https://groups.google.com/d/topic/beancount/_7udAMlfLAg/unsubscribe.
>>>> To unsubscribe from this group and all its topics, send an email to
>>>> [email protected].
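On the account-name problem raised in the thread (the model saying names must not contain spaces while returning "Expenses:Art Supplies"): since there is no confidence indicator, one option is to check the output mechanically before importing it. Below is a minimal sketch in Python, not taken from either gist. The function name fix_account_spaces, the regex, and the assumption that an amount looks like "<number> <CURRENCY>" are all illustrative choices, and it only looks at indented posting lines that start with one of Beancount's five root account types.

```python
import re

# Beancount's five root account types.
ROOTS = ("Assets", "Liabilities", "Equity", "Income", "Expenses")

# An indented posting line: account text, then an optional "<number> <CURRENCY>".
# Assumption: currencies are three or more uppercase letters (e.g. USD).
POSTING = re.compile(
    r"^(?P<indent>\s+)"
    r"(?P<account>(?:%s):.*?)"
    r"(?:\s+(?P<amount>-?\d[\d,.]*\s+[A-Z]{3,}))?\s*$" % "|".join(ROOTS)
)


def fix_account_spaces(text):
    """Replace spaces inside posting account names with hyphens.

    Returns (fixed_text, changed), where changed is a list of
    (original_account, fixed_account) pairs for manual review.
    """
    changed = []
    out = []
    for line in text.splitlines():
        m = POSTING.match(line)
        if m and " " in m.group("account"):
            fixed = re.sub(r"\s+", "-", m.group("account"))
            changed.append((m.group("account"), fixed))
            amount = m.group("amount")
            line = m.group("indent") + fixed + (f"  {amount}" if amount else "")
        out.append(line)
    return "\n".join(out), changed


if __name__ == "__main__":
    sample = (
        '2022-08-19 * "Amazon.com" "Beginning Drawing Atelier: An Instructional Sketchbook"\n'
        "  Assets:Zero-Sum-Accounts:Amazon-Purchases -24.71 USD\n"
        "  Expenses:Art Supplies\n"
    )
    fixed_text, changed = fix_account_spaces(sample)
    print(fixed_text)
    for before, after in changed:
        print(f"; renamed {before!r} -> {after!r}")
```

The list of renamed accounts is returned rather than silently applied so that each change can be reviewed by hand, which fits the "categorize a small batch, manually correct" workflow suggested earlier in the thread.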
