On Tuesday 12 January 2016 10:11:25 Derek Atkins wrote:
> Hi,
>
> On Tue, January 12, 2016 9:52 am, Mike Evans wrote:
> > Hi Geert.
> >
> > I'd appreciate some advice on this bug, since you were the last
> > person to touch the (makes my head hurt) regex.
> >
> > In file dialog-bi-import-gui.c line 328 the regex for description
> > and notes is currently:
> >
> > ((?<desc>[^\",]*)|\"(?<desc>[^\"]*)\")\"
>
> This regex is basically looking for anything within double-quotes,
> except for another double-quote.
>
> The issue would be handling something like:
>
> "<some text>""<more text>"
>
> I.e., in order to escape a double-quote you use a double-double-quote.
> This regex does not handle that case. So it's basically saying "get
> me everything between the double quotes" (without acknowledging the
> double-double-quote scenario).
>
> > I'm not a regex guru but it seems to me that losing the [^\"] part
> > and just using . would accept the problem lines. This wouldn't
> > strip the extra " from the escaped quote, but it would at least be
> > imported and editable later. I'd have thought that just accepting
> > everything inside the quoted field would be the correct behaviour?
>
> Unfortunately I don't think that would work. The construct:
>
> [^\"]*
>
> says to match anything but a double-quote. More likely we need to
> change it to:
>
> (?<desc>([^\"]|\"\")*)
>
> I think this will tell it to match anything but a double-quote, or a
> double-double-quote, as many times as they occur.
>
> Can you try this?
>
> > Mike E
>
> -derek

Wow Derek, you're fast... I saw your response on the list before I even
received Mike's original question...

Anyway, I would also go for your suggestion. Simply replacing [^\"] with
a "." could cause the regexp to match too much.

Regards,

Geert
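
For anyone who wants to check the three variants discussed above without rebuilding GnuCash, here is a rough sketch in Python. It is only an illustration, not the code in dialog-bi-import-gui.c: the patterns are reduced to the quoted-field part, Python spells named groups as (?P<desc>...) rather than (?<desc>...), and the sample line and the helper names extract() and unescape() are made up for this example.

```python
import re

# Simplified stand-ins for the quoted-field part of the regex (assumption:
# the real pattern in the importer has more fields around it).
ORIGINAL = re.compile(r'"(?P<desc>[^"]*)"')            # current behaviour
PROPOSED = re.compile(r'"(?P<desc>(?:[^"]|"")*)"')     # Derek's suggestion
GREEDY = re.compile(r'"(?P<desc>.*)"')                 # replacing [^"] with "."

# Hypothetical input: a description containing escaped quotes, then a notes field.
LINE = '"say ""hi"" now","next"'


def extract(pattern, line):
    """Return the desc group matched at the start of line, or None."""
    m = pattern.match(line)
    return m.group("desc") if m else None


def unescape(field):
    """Turn the CSV-style doubled quotes back into single quotes."""
    return field.replace('""', '"')


if __name__ == "__main__":
    print(extract(ORIGINAL, LINE))   # stops at the first escaped quote
    print(extract(PROPOSED, LINE))   # keeps the whole description
    print(extract(GREEDY, LINE))     # runs on into the next field
    print(unescape(extract(PROPOSED, LINE)))
```

With this sample line the original pattern stops at the first doubled quote, the "." variant swallows the following field as well, and the proposed pattern returns just the description, which can then be unescaped in a separate step.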
