The slot value is a score. You may find that some fragments have several 
entries with different scores. The Bayes matcher finds all of the fragments in 
a string and sums up the scores for each GUID. The GUID with the highest sum is 
the match.

For example, suppose you have the string “Joe’s Fish Market” and the following 
table represents part of your match data:

Fragment    GUID   Score
Joe's       123    1
Joe's       7a3    2
Joe's       9b6    1
Fish        9b6    1
Meat        7a3    2
Barber      123    1
Emporium    7a3    2
Market      9b6    1
Shop        123    1

That would result in the following scores:

123   1
7a3   2
9b6   3

And the matcher would select account 9b6.
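
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the summing as described in this message, not GnuCash's actual code: the names ENTRIES, tokenize, score_accounts and best_match are made up for the example, and splitting the string on whitespace is an assumption about how fragments are found.

```python
from collections import defaultdict

# Hypothetical in-memory copy of the example table:
# (fragment, account GUID, score). Not GnuCash's real data structure.
ENTRIES = [
    ("Joe's", "123", 1),
    ("Joe's", "7a3", 2),
    ("Joe's", "9b6", 1),
    ("Fish", "9b6", 1),
    ("Meat", "7a3", 2),
    ("Barber", "123", 1),
    ("Emporium", "7a3", 2),
    ("Market", "9b6", 1),
    ("Shop", "123", 1),
]


def tokenize(description):
    # Assumption: fragments are separated by whitespace.
    return description.split()


def score_accounts(description, entries):
    """Sum the scores of every matching fragment, per account GUID."""
    tokens = set(tokenize(description))  # each distinct fragment counts once
    totals = defaultdict(int)
    for fragment, guid, score in entries:
        if fragment in tokens:
            totals[guid] += score
    return dict(totals)


def best_match(description, entries):
    """Return the GUID with the highest summed score, or None if nothing matched."""
    totals = score_accounts(description, entries)
    return max(totals, key=totals.get) if totals else None


print(score_accounts("Joe's Fish Market", ENTRIES))
print(best_match("Joe's Fish Market", ENTRIES))
```

Fragments that are in the table but not in the string (Meat, Barber, Emporium, Shop) contribute nothing to the sums.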

Regards,
John Ralls

> On Dec 18, 2018, at 4:34 PM, Steve Cohen <stevec...@gmail.com> wrote:
> 
> OK, I figured out that .gnucash does not describe the file format which is 
> either compressed or non-compressed XML depending on the compression setting 
> you choose.
> 
> So I switched to non-compressed and looked at the Bayesian elements and it's 
> not what I would have expected.  The expressions that are mapped from are not 
> phrases but "words."  Something like
> 
>    <slot>
>      <slot:key>import-map-bayes/INDEPENDENCE/f5cb4b5b31decc01c394dd7170078254</slot:key>
>      <slot:value type="integer">1</slot:value>
>    </slot>
>    <slot>
>      <slot:key>import-map-bayes/INDIA/af48360c2fb9b039b4707ad7d7517950</slot:key>
>      <slot:value type="integer">1</slot:value>
>    </slot>
>    <slot>
>      <slot:key>import-map-bayes/INGTON/94ec6c9aae683c9125fb0dd2b1bb8846</slot:key>
>      <slot:value type="integer">1</slot:value>
>    </slot>
>    <slot>
>      <slot:key>import-map-bayes/INN/c6447afebc9564fded7d1bafbe1e026e</slot:key>
>      <slot:value type="integer">1</slot:value>
>    </slot>
>    <slot>
>      <slot:key>import-map-bayes/INTEREST/b572baae5a56a30ce384ab58ff12ed7d</slot:key>
>      <slot:value type="integer">1</slot:value>
>    </slot>
>    <slot>
>      <slot:key>import-map-bayes/INTUIT/9c204d33baf137f4f0b078f9b61531d1</slot:key>
>      <slot:value type="integer">1</slot:value>
>    </slot>
>    <slot>
>      <slot:key>import-map-bayes/INVESTM/5920c9dbe1d24308893a5eeb32d01e09</slot:key>
>      <slot:value type="integer">3</slot:value>
>    </slot>
>    <slot>
>      <slot:key>import-map-bayes/IS/c6447afebc9564fded7d1bafbe1e026e</slot:key>
>      <slot:value type="integer">4</slot:value>
>    </slot>
> 
> So I am trying to understand how these are applied.  I get that the long hex 
> numbers are GUIDs representing accounts and that the expressions before this 
> are bits of the transaction description.  But what if the transaction 
> description is multiple words, each mapping to a different account?  
> Obviously "INVESTM" and "IS" are going to be pulled in many different 
> directions.  How does "INGTON" get in there?  Why isn't it "WASHINGTON"? So 
> I'm trying to understand how this works at all.
> 
> I know that it does, but I can't imagine how.
> 
> The long hex numbers are GUIDs corresponding to accounts.
> On 12/18/18 5:59 PM, Stephen M. Butler wrote:
>> On 12/18/18 3:31 PM, Steve Cohen wrote:
>>> Thanks.
>>> 
>>> Seems like none of these solutions will work if your data is stored as a 
>>> .gnucash file, they only work with .xml files.
>>> 
>>> Is there a way to convert this?
>>> 
>>> Is the Bayesian matching applied to entries that are corrected in the 
>>> account editor, or is it only applied to entries made in the importer?
>>> 
>>> I am somewhat comfortable with the bleeding edge, but, when is the release 
>>> of version 4 expected?
>>> 
>>> 
>>> On 12/18/18 5:17 PM, David Cousens wrote:
>>>> Steve
>>>> 
>>>> These may help.
>>>> https://wiki.gnucash.org/wiki/Bayes
>>>> https://lists.gnucash.org/pipermail/gnucash-user/2016-July/066299.html
>>>> http://gnucash.1415818.n4.nabble.com/Fixing-confused-bayesian-matching-data-td4685819.html
>>>>  
>>>> http://blog.jdlh.com/en/2016/07/29/resetting-gnucashs-import-transaction-matching/
>>>>  
>>>> 
>>>> Make a backup of your data file and only work on a copy until you are sure
>>>> it is working after changing it if you attempt any of the solutions
>>>> mentioned in the above posts.
>>>> 
>>>> The importer stores the map data and probabilities during the final step of
>>>> the import process. If you let transactions go through to Imbalance then it
>>>> obviously gets no data to work with. If you assign all transactions to a
>>>> specific transfer account before import and continue to do that, it will
>>>> eventually correct itself. There are a few situations in which the Bayesian
>>>> matcher does not work. I find that where a unique transaction number
>>>> changes with each periodic transaction, the matcher seems to run into
>>>> problems. A number identifying the payer/payee and not the transaction
>>>> itself is OK. Some of mine have both.
>>>> 
>>>> A feature to be added in GnuCash V4 allows multiple selection of
>>>> transactions and assignment of a single transfer account in the import
>>>> matcher, which speeds up the transaction matching process significantly.
>>>> It can be incorporated in V3.x as a patch if you build GnuCash from
>>>> source, but the risk is that future bug fixes in the importer which
>>>> change the two affected files could result in a non-working GnuCash. It
>>>> is incorporated in the master branch of the GitHub repository and can be
>>>> built from that if you are comfortable working with the bleeding edge.
>>>> 
>>>> David Cousens
>>>> 
>> Steve,
>> In GnC, click on the Tools menu and then on the Import Map Editor.  Once on 
>> the new screen you can see all the mappings that have been generated.
>> In my case, I did some restructuring of my accounts and found that the 
>> existing mappings no longer worked.  I highlighted the top levels and 
>> clicked on the DELETE key.  That reset everything for me and I'm in the 
>> process of building the new set of mappings.
>> The high level is based on the imports you do.  I had three: Checking 
>> account, Credit Card, and Savings account.  The last one is used so little 
>> that it isn't worth the hassle of downloading the 1-2 entries each month so 
>> I now enter them by hand.  That will leave me with just two imports -- which 
>> I plan to do multiple times each month to keep the number of transactions 
>> low.
>> Anyway, if you decide to clear everything out, the above is a nice and easy 
>> way to do that.
>> --Steve
> 
> _______________________________________________
> gnucash-user mailing list
> gnucash-user@gnucash.org
> To update your subscription preferences or to unsubscribe:
> https://lists.gnucash.org/mailman/listinfo/gnucash-user
> If you are using Nabble or Gmane, please see 
> https://wiki.gnucash.org/wiki/Mailing_Lists for more information.
> -----
> Please remember to CC this list on all your replies.
> You can do this by using Reply-To-List or Reply-All.
