Thank you Adrien, David and @flywire for your very helpful responses. I actually created a script that transforms different file formats and data structures and, using a machine learning classifier, autopopulates the "Transfer Account" field based on the description.
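To make the idea concrete, here is a minimal sketch of what such a transformation step could look like. It is illustrative only and not the actual script: the per-bank column names in `BANK_LAYOUTS`, the function names, and the `predict` callback (a stand-in for the trained classifier) are all assumptions, and only the output columns DATE, DESCRIPTION, ACCOUNT, DEPOSIT, WITHDRAWAL and TRANSFER ACCOUNT come from this message.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

STANDARD_FIELDS = ["DATE", "DESCRIPTION", "ACCOUNT", "DEPOSIT", "WITHDRAWAL"]

# Hypothetical per-bank layouts: which source column holds which value,
# and how that bank writes its dates. Real statements will differ.
BANK_LAYOUTS = {
    "bank1": {"date": "Posting Date", "description": "Details",
              "amount": "Amount", "date_format": "%m/%d/%Y"},
    "bank2": {"date": "Date", "description": "Narrative",
              "amount": "Value", "date_format": "%d-%m-%Y"},
}


def normalize_rows(raw_csv_text, bank, account_name):
    """Convert one bank's CSV export into rows using the standard columns."""
    layout = BANK_LAYOUTS[bank]
    rows = []
    for record in csv.DictReader(io.StringIO(raw_csv_text)):
        # Strip thousands separators before parsing the signed amount.
        amount = Decimal(record[layout["amount"]].replace(",", ""))
        date = datetime.strptime(record[layout["date"]], layout["date_format"])
        rows.append({
            "DATE": date.strftime("%Y-%m-%d"),
            "DESCRIPTION": record[layout["description"]].strip(),
            "ACCOUNT": account_name,
            # Positive amounts are deposits, negative ones withdrawals.
            "DEPOSIT": f"{amount:.2f}" if amount > 0 else "",
            "WITHDRAWAL": f"{-amount:.2f}" if amount < 0 else "",
        })
    return rows


def add_transfer_account(rows, predict):
    """Fill TRANSFER ACCOUNT using predict(description) -> account name."""
    for row in rows:
        row["TRANSFER ACCOUNT"] = predict(row["DESCRIPTION"])
    return rows


def to_csv(rows):
    """Serialize the rows as one CSV with the standard header."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=STANDARD_FIELDS + ["TRANSFER ACCOUNT"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Each statement would go through `normalize_rows` with its own layout, the results would be concatenated, and `add_transfer_account` would then be called once with the classifier's prediction function before writing the combined file with `to_csv`.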
My script goes through the different formats, since banks generally are not standardized, and transforms the files into a simple standard CSV with DATE, DESCRIPTION, ACCOUNT, DEPOSIT, WITHDRAWAL. After the transformation, I run the result through my classifier, which adds the TRANSFER ACCOUNT field and autopopulates it with the predicted transfer account. The prediction is an exact string match for an existing account: e.g. if the transfer account is "XXXYYYZZZ" in the CSV, there is an account in GnuCash named exactly "XXXYYYZZZ".

I figured that if I explicitly include the specific Transfer Account per line in the CSV, it should help GnuCash match. This could save me a lot of time, since I would not have to transform each bank account statement or log individually. I have lots of accounts, and I imagine I'm not alone in this.

I understand from your comments that GnuCash uses a naive Bayes algorithm to predict the TRANSFER ACCOUNT from the description, and that you have to feed it transactions little by little for it to learn. But as I have said, I have already pre-processed the data, so GnuCash does not have to do that: I explicitly feed it the exact transfer account. I really want to avoid doing the import account by account if I can.

On Wed, Jun 3, 2020 at 5:21 AM David Cousens <[email protected]> wrote:

> Gio,
>
> The GnuCash import matching procedure searches for duplicate
> transactions in a time window around the date of the imported record,
> set at +-42 days. A match score is then calculated based on the
> differences in dates and amounts and the matching of tokenized data.
> It is generally very good at picking up duplicates but will sometimes
> match regular repeated payments.
> An imported transaction which matches an existing one will have the
> "C" ("R" in older GnuCash versions) checkbox checked if the match is
> very good, but may have the "U+C" ("U+R" in older GnuCash versions)
> checkbox checked if it is not a very close match (usually if the
> description and memo fields are different). It will pay to check that
> all matches are actually valid. You can swap between viewing the
> register(s) and the import matcher window. The Help manual section on
> importing will help you with understanding the significance of the
> background colours of the rows and the meaning of the checkboxes in
> the importer.
>
> https://www.gnucash.org/docs/v3/C/gnucash-help/trans-import.html#:~:text=Navigate%20to%20the%20MT940%2C%20MT942,the%20transactions%20in%20the%20file
>
> Even though you are specifying the transfer accounts, the importer
> will ask you to map the account name in the transfer field to an
> internal GnuCash account each time it encounters an account name for
> the first time in the import data.
>
> David Cousens
>
> On Tue, 2020-06-02 at 17:57 +0800, Gio Bacareza wrote:
> > Hi GNUCash experts,
> >
> > I'm planning to consolidate all my statements into 1 big CSV. This
> > CSV would naturally have an account field and a transfer field.
> >
> > Since all transactions are there, there will be instances where
> > there will be duplicates. For example, a statement from bank1 could
> > have a transfer from bank1 to bank2. So account = bank1, transfer =
> > bank2, amount = -100 for example. But, since I put it all together
> > in 1 file, there will be another line where account = bank2,
> > transfer = bank1, amount = 100.
> >
> > When I import this big file containing all transactions from
> > different statements, how will gnucash handle this? Can it
> > automatically detect that this is one and the same transaction?
> >
> > The alternative is to import account by account but that is too
> > time consuming for me.
> >
> > Is there a better way?
>
> --
> Dr David R Cousens
> B.Sc, M.Prof. Acc., Ph.D., G.C.Ed

--
cheers,
Gio
