Has anyone tried fine-tuning smaller LLMs (e.g., with LoRA) for this task?

On Sunday, April 9, 2023 at 5:36:16 AM UTC+5:30 [email protected] wrote:

> This version 
> <https://gist.github.com/reedlaw/5f84c801c20ad5f1e6b16c6ae193b4a7> uses 
> an OpenAI API key and transforms the CSV file by adding a Category column.
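
For anyone who wants the general shape of that CSV transformation without opening the gist, here is a minimal sketch. It is not the gist's code: the `Description` column name, the function names, and the keyword-based `keyword_classifier` (a stand-in for the actual OpenAI call) are all assumptions made for illustration.

```python
import csv
import io


def add_category_column(src, dst, classify, description_field="Description"):
    """Copy CSV rows from src to dst, appending a Category column.

    `classify` is any callable mapping a description string to a category.
    The "Description" column name is an assumption, not taken from the gist.
    """
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or []) + ["Category"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["Category"] = classify(row[description_field])
        writer.writerow(row)


def keyword_classifier(description):
    """Hypothetical placeholder standing in for the LLM call."""
    text = description.lower()
    if "wiper" in text:
        return "Expenses:Car:Maintenance"
    if "pens" in text:
        return "Expenses:Art"
    return "Expenses:Uncategorized"


if __name__ == "__main__":
    src = io.StringIO(
        "Date,Description,Amount\n"
        "2022-10-05,Bosch Rear Wiper Blade,11.60\n"
    )
    dst = io.StringIO()
    add_category_column(src, dst, keyword_classifier)
    print(dst.getvalue(), end="")
```

Keeping the classifier as a plain callable means the same loop would work with a remote API or a local model swapped in.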
>
> On Saturday, April 8, 2023 at 7:37:05 PM UTC-4 Reed Law wrote:
>
>> I got decent results with this script 
>> <https://gist.github.com/reedlaw/c0adb9dbc98fc843748365c8c2154a61>. It 
>> uses the LangChain Python library with llama.cpp 
>> <https://github.com/ggerganov/llama.cpp> and the Alpaca model. The 
>> results aren't as good as with the OpenAI API, but it keeps everything 
>> local and private.
>>  
>> On Tuesday, April 4, 2023 at 6:45:17 PM UTC-4 Reed Law wrote:
>>
>>> Good reminder about privacy. There is an open-source chatbot, Vicuna 
>>> <https://github.com/lm-sys/FastChat>, which plans to release its model 
>>> weights next week, meaning you will be able to use it privately and offline.
>>>
>>> On Sat, Apr 1, 2023 at 10:57 PM Martin Blais <[email protected]> wrote:
>>>
>>>> FYI for those who might be tempted to feed their entire ledger into the 
>>>> thing: keep in mind the privacy implications of doing this.
>>>> The queries are very likely logged and, even if they are used only 
>>>> indirectly as training data for further updates, might leak your 
>>>> personal info to someone else generating transaction data.
>>>> Just saying... it might not be obvious to some.
>>>>
>>>>
>>>>
>>>> On Sat, Apr 1, 2023 at 10:51 PM Red S <[email protected]> wrote:
>>>>
>>>>> Very neat! I've got to admit, I'm bowled over by the natural language 
>>>>> interface working so well.
>>>>>
>>>>> Amazon purchase classification is almost the only area of 
>>>>> classification where smart_importer doesn't do very well for me, since 
>>>>> there are few repeat purchases or keywords.
>>>>>
>>>>> Good to know about aichat too.
>>>>>
>>>>> That said, here are the results I got when I tried to replicate this 
>>>>> with the same data. Note the claim at the bottom: "You can use any 
>>>>> account name you like as long as it doesn’t contain spaces," and how it 
>>>>> contradicts literally every classification, each of which contains a 
>>>>> space. One of the limitations of the current generation is the lack of 
>>>>> a confidence-in-results indicator, which shows up here.
>>>>>
>>>>> 2022-08-19 * "Amazon.com" "Beginning Drawing Atelier: An Instructional 
>>>>> Sketchbook"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -24.71 USD
>>>>>   Expenses:Art Supplies
>>>>>
>>>>> 2022-09-14 * "Amazon.com" "Amazon Basics Clear Sheet Protectors for 3 
>>>>> Ring Binder, 8.5 x 11 Inch, 200-Pack"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -18.16 USD
>>>>>   Expenses:Office Supplies
>>>>>
>>>>> 2022-10-05 * "Amazon.com" "Bosch Rear Wiper Blade H307 /3397011429 
>>>>> Original Equipment Replacement- 12\" (Pack of 1)"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -11.6 USD
>>>>>   Expenses:Auto Parts
>>>>>
>>>>> 2023-01-26 * "Amazon.com" "Duxtop 1600W Single Burner Electric Cooktop, 
>>>>> Electric Hot Plate for Cooking, Electric Stove with Sensor Touch Control, 
>>>>> Portable Infrared Burner with T"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -69.43 USD
>>>>>   Expenses:Kitchen Appliances
>>>>>
>>>>> 2023-01-26 * "Amazon.com" "PILOT Parallel Caligraphy Pens, Assorted 
>>>>> Point Sizes and Colors, 6 Count (12722)"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -32.44 USD
>>>>>   Expenses:Art Supplies
>>>>>
>>>>> 2023-02-28 * "Amazon.com" "Laptop Stand, Laptop Holder Riser Computer 
>>>>> Stand, Adjustable Aluminum Foldable Portable Notebook Stand, Compatible 
>>>>> with MacBook Air Pro, HP, Lenovo, D"
>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -14.29 USD
>>>>>   Expenses:Laptop Accessories
>>>>>
>>>>> In this example, I’ve added a category to each transaction by adding a 
>>>>> new line after the transaction line and indenting it by two spaces. The 
>>>>> category is specified as an account name and is preceded by a colon. You 
>>>>> can use any account name you like as long as it doesn’t contain spaces.
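
The mismatch described above (the model says account names contain no spaces, yet suggests accounts like "Expenses:Art Supplies") is the kind of thing that can be caught mechanically instead of by eye. Below is a rough sketch of such a check. The regular expression is only a simplified approximation of Beancount's account syntax, and the function name and the "indented line means posting" heuristic are assumptions for illustration, not part of Beancount.

```python
import re

# Simplified approximation of a Beancount account name: one of the five
# default root types, then colon-separated components that start with an
# uppercase letter or digit and contain only letters, digits, and dashes.
ACCOUNT_RE = re.compile(
    r"(Assets|Liabilities|Equity|Income|Expenses)(:[A-Z0-9][A-Za-z0-9-]*)+"
)

# A trailing amount such as "  -24.71 USD" at the end of a posting line.
AMOUNT_RE = re.compile(r"\s+-?\d+(\.\d+)?\s+[A-Z]+$")


def invalid_accounts(ledger_text):
    """Return the account strings on posting lines that fail ACCOUNT_RE.

    Assumption: any non-empty line that starts with whitespace is a posting.
    """
    bad = []
    for line in ledger_text.splitlines():
        if not line.startswith((" ", "\t")) or not line.strip():
            continue
        candidate = AMOUNT_RE.sub("", line.strip())
        if not ACCOUNT_RE.fullmatch(candidate):
            bad.append(candidate)
    return bad


if __name__ == "__main__":
    sample = (
        '2022-08-19 * "Amazon.com" "Beginning Drawing Atelier"\n'
        "  Assets:Zero-Sum-Accounts:Amazon-Purchases  -24.71 USD\n"
        "  Expenses:Art Supplies\n"
    )
    print(invalid_accounts(sample))
```

Running a check like this on the model's output gives at least a crude substitute for the missing confidence indicator: anything it flags needs a second look.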
>>>>>
>>>>> On Saturday, April 1, 2023 at 7:51:20 AM UTC-7 [email protected] wrote:
>>>>>
>>>>>> I used a command-line tool, aichat 
>>>>>> <https://github.com/sigoden/aichat>, to pipe transactions to GPT-4 
>>>>>> and categorize expenses. I gave it the following input (generated 
>>>>>> according to Booking Amazon Purchases 
>>>>>> <https://reds-rants.netlify.app/personal-finance/booking-amazon-purchases/>):
>>>>>>
>>>>>> 2022-08-19 * "Amazon.com" "Beginning Drawing Atelier: An 
>>>>>> Instructional Sketchbook"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -24.71 USD
>>>>>>
>>>>>> 2022-09-14 * "Amazon.com" "Amazon Basics Clear Sheet Protectors for 3 
>>>>>> Ring Binder, 8.5 x 11 Inch, 200-Pack"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -18.16 USD
>>>>>>
>>>>>> 2022-10-05 * "Amazon.com" "Bosch Rear Wiper Blade H307 /3397011429 
>>>>>> Original Equipment Replacement- 12\" (Pack of 1)"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -11.6 USD
>>>>>>
>>>>>> 2023-01-26 * "Amazon.com" "Duxtop 1600W Single Burner Electric 
>>>>>> Cooktop, Electric Hot Plate for Cooking, Electric Stove with Sensor 
>>>>>> Touch Control, Portable Infrared Burner with T"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -69.43 USD
>>>>>>
>>>>>> 2023-01-26 * "Amazon.com" "PILOT Parallel Caligraphy Pens, Assorted 
>>>>>> Point Sizes and Colors, 6 Count (12722)"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -32.44 USD
>>>>>>
>>>>>> 2023-02-28 * "Amazon.com" "Laptop Stand, Laptop Holder Riser Computer 
>>>>>> Stand, Adjustable Aluminum Foldable Portable Notebook Stand, Compatible 
>>>>>> with MacBook Air Pro, HP, Lenovo, D"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -14.29 USD
>>>>>>
>>>>>> Using the following prompt and command:
>>>>>>
>>>>>> cat amz.beancount | aichat "add categorized expense postings to \
>>>>>> balance each of the following transactions, preserving the included zero \
>>>>>> sum posting. do not use spaces in account names" > out.beancount
>>>>>>
>>>>>> I got these results:
>>>>>>
>>>>>> 2022-08-19 * "Amazon.com" "Beginning Drawing Atelier: An 
>>>>>> Instructional Sketchbook"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -24.71 USD
>>>>>>   Expenses:Art:Instructional-Sketchbook
>>>>>>
>>>>>> 2022-09-14 * "Amazon.com" "Amazon Basics Clear Sheet Protectors for 3 
>>>>>> Ring Binder, 8.5 x 11 Inch, 200-Pack"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -18.16 USD
>>>>>>   Expenses:Office-Supplies:Sheet-Protectors
>>>>>>
>>>>>> 2022-10-05 * "Amazon.com" "Bosch Rear Wiper Blade H307 /3397011429 
>>>>>> Original Equipment Replacement- 12\" (Pack of 1)"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -11.6 USD
>>>>>>   Expenses:Car:Maintenance
>>>>>>
>>>>>> 2023-01-26 * "Amazon.com" "Duxtop 1600W Single Burner Electric 
>>>>>> Cooktop, Electric Hot Plate for Cooking, Electric Stove with Sensor 
>>>>>> Touch Control, Portable Infrared Burner with T"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -69.43 USD
>>>>>>   Expenses:Kitchen:Appliances
>>>>>>
>>>>>> 2023-01-26 * "Amazon.com" "PILOT Parallel Caligraphy Pens, Assorted 
>>>>>> Point Sizes and Colors, 6 Count (12722)"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -32.44 USD
>>>>>>   Expenses:Art:Calligraphy-Pens
>>>>>>
>>>>>> 2023-02-28 * "Amazon.com" "Laptop Stand, Laptop Holder Riser Computer 
>>>>>> Stand, Adjustable Aluminum Foldable Portable Notebook Stand, Compatible 
>>>>>> with MacBook Air Pro, HP, Lenovo, D"
>>>>>>   Assets:Zero-Sum-Accounts:Amazon-Purchases  -14.29 USD
>>>>>>   Expenses:Office-Supplies:Laptop-Stand
>>>>>>
>>>>>> It took some trial and error to arrive at the final prompt. At 
>>>>>> first, it removed the zero-sum postings and output only the expenses, 
>>>>>> and the expense accounts had spaces in their names (e.g. 
>>>>>> "Expenses:Office Supplies:Laptop Stand"). There is still room for 
>>>>>> improvement (I don't really care to have an account for laptop 
>>>>>> stands). Overall, it looks very helpful. Perhaps a good workflow is to 
>>>>>> use GPT to categorize a small batch of expenses, manually correct 
>>>>>> them, and then use smart_importer 
>>>>>> <https://github.com/beancount/smart_importer> to handle the rest.
>>>>>>
>>>>>> If anyone else has tips please share!
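
On the "I don't really care to have an account for laptop stands" point, one possible tip is to trim overly specific expense accounts after the fact rather than fighting the prompt. The sketch below is only an illustration: the function names, the `max_depth` parameter, and the assumption that postings are lines indented by two spaces are all mine, not something aichat or Beancount provides.

```python
def collapse_expense_account(account, max_depth=2):
    """Keep only the first max_depth components of an Expenses account.

    max_depth counts the root, so 2 turns
    "Expenses:Office-Supplies:Laptop-Stand" into "Expenses:Office-Supplies".
    Non-expense accounts are returned unchanged.
    """
    parts = account.split(":")
    if parts[0] != "Expenses":
        return account
    return ":".join(parts[:max_depth])


def collapse_postings(ledger_text, max_depth=2):
    """Apply collapse_expense_account to every posting line in the text.

    Assumption: a posting is a non-empty line indented by two spaces, and
    the account is its first whitespace-delimited token.
    """
    out = []
    for line in ledger_text.splitlines():
        stripped = line.strip()
        if line.startswith("  ") and stripped:
            account, _, rest = stripped.partition(" ")
            account = collapse_expense_account(account, max_depth)
            line = "  " + account + ((" " + rest) if rest else "")
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    text = (
        '2023-02-28 * "Amazon.com" "Laptop Stand"\n'
        "  Assets:Zero-Sum-Accounts:Amazon-Purchases  -14.29 USD\n"
        "  Expenses:Office-Supplies:Laptop-Stand"
    )
    print(collapse_postings(text))
```

This leaves the zero-sum posting untouched and only shortens the expense leg, so the manually corrected batch handed to smart_importer uses coarser categories.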
>>>>>>
>>>>>>
>>>>>> On Saturday, April 1, 2023 at 1:53:46 AM UTC-4 Red S wrote:
>>>>>>
>>>>>>> Great idea. Talk about drudgery!
>>>>>>>
>>>>>>> On Friday, March 31, 2023 at 12:35:09 AM UTC-7 [email protected] 
>>>>>>> wrote:
>>>>>>>
>>>>>>> I haven't tried yet, but my hope is that GPT would be great at 
>>>>>>> writing test code. Can you imagine?
>>>>>>>
>>>>>>> On Fri, Mar 31, 2023, 01:03 Red S <[email protected]> wrote:
>>>>>>>
>>>>>>> Fantastic! Hooray for reducing drudgery!
>>>>>>>
>>>>>>> It's apparently quite effective at using code from existing github 
>>>>>>> projects. I love the code explanations in natural language.
>>>>>>>
>>>>>>> -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "Beancount" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/beancount/b08ca160-539a-4d7a-ba9c-8540f214d2bcn%40googlegroups.com
>>>>>  
>>>>> <https://groups.google.com/d/msgid/beancount/b08ca160-539a-4d7a-ba9c-8540f214d2bcn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/beancount/13eade17-f015-4bcb-a835-1fee19a6f4een%40googlegroups.com.
