Re: load_file omits some entries (balances)

Florian Lindner Sun, 19 May 2019 12:02:55 -0700

On 17.05.19 at 01:01, Martin Blais wrote:
> Alright, now I see what you want to do.
> You want to rewrite your payees, but in the source file itself.
> That's a nice idea.
Thanks!
> However, I don't think you'll be able to put together a nice solution 
with rewriting after processing.
> I would work off the source text itself.
> Or even better: as a combination of both.
> Here's what you could do: parse the entire thing, filter just the 
transactions.
> For each transaction you have the filename and line number.
> Do whatever remapping / processing / cleaning you want to do on the payee 
names in your script.
> Then, process each file, using a regexp to replace the first string that 
occurs on the lines where you have transactions with renamed payees.
> 
> This is better than working purely from the source file because you won't 
have to write a full alternative parser to make your replacements; all you 
need to ace is replacement of the first string on those transaction lines 
and leave all the other lines untouched. Should be pretty easy and robust 
enough (tip: make sure you safeguard your files in a git/hg repo and diff 
just in case). The benefit is your source files will keep all the other 
formatting and comments and spacing and ordering and whatever else.
> 
> This is how I'd go about this.
> I think it would even be possible to template this and provide helper 
functions.
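
A minimal sketch of the line-based replacement described above, assuming the parsed transactions have already been reduced to a mapping from line number to new payee. The names rename_payees and renames are hypothetical helpers for illustration and are not part of beancount. The pattern does not handle escaped quotes inside strings.

```python
import re

# Matches the first double-quoted string on a line (escaped quotes are not handled).
FIRST_STRING = re.compile(r'"[^"]*"')


def rename_payees(lines, renames):
    """Return a copy of `lines` with the first quoted string replaced.

    `renames` maps 1-based line numbers to the new payee. Both names are
    hypothetical helpers for this sketch, not part of beancount.
    """
    out = list(lines)
    for lineno, new_payee in renames.items():
        replacement = '"{}"'.format(new_payee)
        # A function is used as the replacement so that backslashes in the
        # payee are not interpreted as regex escapes.
        out[lineno - 1] = FIRST_STRING.sub(
            lambda m: replacement, out[lineno - 1], count=1)
    return out


source = [
    '2018-05-20 * "AMAZON EU SARL" "Order 1234"\n',
    '  Assets:Giro       -115.00 EUR\n',
    '  Expenses:Unknown\n',
]
result = rename_payees(source, {1: "Amazon"})
```

All other lines are passed through untouched, which is the point of the approach. Note that a transaction line with a single string carries only a narration, so a real script would have to skip lines whose parsed transaction has no payee.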
OK, I understand what you're suggesting, but I am not sure that is the 
way to go. For a simple case such as replacing payees it is fine, but I 
think that for more complex tasks, like adding new metadata fields, changing 
accounts, or even splitting transactions between accounts, a 
search-and-replace approach will evolve into rewriting entire 
transactions in the source file from the Transaction object anyway.


Right now, I think the best way for me is to read the beancount file into 
a string, parse it using bc.parser.parser.parse_many, and perform the 
transformations on the resulting entries. Then I rewrite the entire file 
using bc.parser.printer.print_entries.

You wrote:

> Still, when you write entries out, they won't look precisely the same as 
the input. Numbers will have been filled in, cost bases will show up, etc. 
I don't see the point.

Given my very simple transactions, e.g.,

2018-05-20 * "KREDITKARTENABRECHNUNG" "18.05.18 1234"
  buchungsart: "Lastschr.Kreditkarten"
  empfaenger: " / "
  hash: "e4a580e7002e606a4314f864f64f30a12fb8673f"
  Assets:Giro       -115.00 EUR
  Expenses:Unknown             

Rewriting the ledger file, as I mentioned above, does not change an entry 
like that.

I currently use only simple beancount syntax but may want to use more of 
it later. Do you consider that kind of rewriting a problem?

In the long run, I think a rewriting protocol would make a beneficial 
addition to beancount, as Stefano also suggested.

Maybe something like the importer protocol:

class MyRewriter:

  def rewriteTransaction(self, txn):
    return txn

  def rewriteBalance(self, bal):
    return bal

or similar, with one function for each directive type. You would then 
invoke bean-rewrite on a file or a set of transactions. Just a first 
idea...  ;-)
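
To make the idea a bit more concrete, here is a rough sketch of how such a protocol could dispatch entries to per-type methods. Everything in it is an assumption for illustration: the Rewriter base class, the rewrite_<type> naming scheme (snake_case instead of the camelCase above), and the namedtuple stand-ins that replace beancount's real directive types so that the snippet runs on its own.

```python
from collections import namedtuple

# Stand-ins for beancount's directive types so the sketch is self-contained.
Transaction = namedtuple("Transaction", "date payee narration")
Balance = namedtuple("Balance", "date account amount")


class Rewriter:
    """Hypothetical base class: dispatches each entry to rewrite_<type>."""

    def rewrite(self, entries):
        result = []
        for entry in entries:
            # Look up e.g. rewrite_transaction for a Transaction entry.
            name = "rewrite_" + type(entry).__name__.lower()
            method = getattr(self, name, None)
            # Entries without a matching method pass through unchanged.
            result.append(method(entry) if method else entry)
        return result


class PayeeRewriter(Rewriter):
    def rewrite_transaction(self, txn):
        if txn.payee.startswith("AMAZON"):
            return txn._replace(payee="Amazon")
        return txn


entries = [
    Transaction("2018-05-20", "AMAZON EU", "Order"),
    Balance("2018-05-21", "Assets:Giro", "100.00 EUR"),
]
rewritten = PayeeRewriter().rewrite(entries)
```

A rewriter then only needs to implement the methods for the directive types it cares about; everything else is written back as it was read.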

Best Regards,
Florian


> On Wed, May 15, 2019 at 4:12 AM Florian Lindner wrote:
> 
>     Hi,
> 
>     On 15.05.19 at 02:57, Martin Blais wrote:
>     > But why are you trying to do this? What's your purpose?
>     My importer applies a set of rules to convert payee names and to assign 
certain kinds of transactions to accounts:
> 
>     # List of tuples (regular expression, replacement)
>     payee_replacements = [
>         ("^AMAZON", "Amazon"),
>     ]
> 
>     # List of tuples (python expression to match, second account to set)
>     accounts_assignments = [
>         ("desc == 'Miete PSW 1'", "Expenses:Miete"),
>         ("payee in ['REWE', 'Kaufland', 'ALDI']", "Expenses:Groceries"),
>         ("True", "Expenses:Unknown")
>     ]
> 
> 
>     import re
> 
>     def transform_txn(txn):
>         # Normalize the payee using the first matching replacement rule.
>         payee = txn.payee
>         for pattern, substitute in payee_replacements:
>             # The payee may be missing (None), so guard before matching.
>             if payee and re.match(pattern, payee):
>                 payee = substitute
>                 break
> 
>         txn = txn._replace(payee=payee)
> 
>         # Names that the account-assignment expressions may refer to.
>         local_vars = {"payee": txn.payee, "desc": txn.narration,
>                       "buchungsart": txn.meta["buchungsart"]}
>         if txn.postings[1].account == "Expenses:Unknown":
>             for expr, acc in accounts_assignments:
>                 if eval(expr, local_vars):
>                     # First matching rule wins; the final "True" rule is the fallback.
>                     txn.postings[1] = txn.postings[1]._replace(account=acc)
>                     break
> 
>         return txn
> 
> 
>     These two rulesets are applied on import.
> 
>     I want to also apply them on existing ledgers.
> 
>     Use case: I identify a recurring transaction pattern, such as 
"buchungsart == 'GAA,Spk.Netz'". All matching transactions, newly imported 
as well as existing ones, should have the account "Assets:Bargeld" assigned. 
For that, I need a method to read in all transactions, transform them, and 
write them to a beancount file.
> 
>     This is my solution to this question: 
https://groups.google.com/forum/#!topic/beancount/e93VI4s4YCQ
> 
>     An alternative approach would be plugins. As far as I understand, 
plugins only apply live transformations, i.e., they transform data as it is 
loaded from a file but do not write the data back to the file.
> 
>     >> A workaround I see, is to read in main.beancount and write out the 
entries to different files based on entries[6].meta["filename"]. Basically 
rewriting the entire ledger.
>     >
>     > I was going to suggest this.
>     > Still, when you write entries out, they won't look precisely the 
same as the input. Numbers will have been filled in, cost bases will show 
up, etc. I don't see the point.
>     Yes, I have noticed that, but that seems ok to me.
> 
>     I hope I was able to explain my use case. I am open to any thoughts 
and ideas to achieve that differently.
> 
>     Best Regards,
>     Florian
> 
>     >
>     >
>     >
>     > On Mon, May 13, 2019 at 10:36 AM Florian Lindner wrote:
>     >
>     >         I see.
>     >         Well FWIW, entries which have errors are not guaranteed to 
show up in the output stream at all.
>     >         It's unclear to me whether this is always the best outcome, 
but a long while ago I decided to do this for transactions and for some 
other directives.
>     >         
https://bitbucket.org/blais/beancount/src/d1b2cbf2841669e988f6692ec1d39db3708730cc/beancount/ops/balance.py#lines-119
>     >
>     >         I don't have a solution for you. This is an unusual case.
>     >
>     >
>     >     I tried to apply the workaround I mentioned:
>     >
>     >         entries, errors, option_map = bc.loader.load_file(args.inputfile)
>     >         sorted_entries = {}  # filename -> list of entries
>     >
>     >         for e in entries:
>     >             entry = transform_txn(e) if isinstance(e, data.Transaction) else e
>     >             name = entry.meta["filename"]
>     >             sorted_entries.setdefault(name, []).append(entry)
>     >
>     >         for filename in sorted_entries:
>     >             with open(filename, "w") as f:
>     >                 bc.parser.printer.print_entries(sorted_entries[filename], file=f)
>     >
>     >     One problem is that in main.beancount I have some options set 
(e.g. operating_currency). They don't show up in entries but in option_map, 
and I don't know how to write them to a file.
>     >
>     >     Another idea: at first try, it seems that reading the entire 
file into a string and using beancount.parser.parser.parse_many would work 
and would also parse the balances:
>     >
>     >         with open(args.inputfile, "r") as f:
>     >             instr = f.read()
>     >       
>     >         entries = bc.parser.parser.parse_many(instr)
>     >
>     >     Seems to work fine so far. What do you think?

