D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
ahmadsamir added inline comments. INLINE COMMENTS > dfaure wrote in kuitmarkup.cpp:1597 > I thought it was the resolved version, without entities (if they were > successfully resolved). If the entity is unknown it'll be left as is, I don't know how often that actually happens though. REPOSITORY R249 KI18n REVISION DETAIL https://phabricator.kde.org/D26285 To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns
D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
ahmadsamir added a task: T12279: Port frameworks away from QRegExp. REPOSITORY R249 KI18n REVISION DETAIL https://phabricator.kde.org/D26285 To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns
D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
This revision was automatically updated to reflect the committed changes. Closed by commit R249:f9b932d254c4: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression (authored by ahmadsamir). REPOSITORY R249 KI18n CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D26285?vs=72567=72569 REVISION DETAIL https://phabricator.kde.org/D26285 AFFECTED FILES src/kuitmarkup.cpp To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns
D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
dfaure accepted this revision. REPOSITORY R249 KI18n BRANCH l-finalizeVisualText (branched from master) REVISION DETAIL https://phabricator.kde.org/D26285 To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns
D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
ahmadsamir updated this revision to Diff 72567. ahmadsamir added a comment. Ternary that doesn't return a value is weird, better avoid. "Grammar fixes" in comments add noise to the diff, be kind to reviewers. REPOSITORY R249 KI18n CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D26285?vs=72539=72567 BRANCH l-finalizeVisualText (branched from master) REVISION DETAIL https://phabricator.kde.org/D26285 AFFECTED FILES src/kuitmarkup.cpp To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns
D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
dfaure accepted this revision. dfaure added a comment. This revision is now accepted and ready to land. I wonder if others agree that "a ? b() : c()" is bad form for void b() and void c(), compared to an if(), or if it's just me. INLINE COMMENTS > kuitmarkup.cpp:1582 > +// unknown Unicode point, leave it as is > +ok ? plain.append(c) : plain.append(match.captured(0)); > } else if (s->xmlEntities.contains(ent)) { // known entity I don't really like ternary operators that don't return a value, this is really an if(). (I guess the QChar/QString type difference is the reason why append(ok?c:...) wouldn't work). > ahmadsamir wrote in kuitmarkup.cpp:1572 > I wouldn't know about C, having never written any code with it. > > > This is in no way more efficient, construction isn't very different from > > assignment. > > That is the key I was missing (which means I should do some research, before > "assuming" stuff). My statement doesn't apply to all classes obviously, the ctor/dtor could be very expensive and then my statement is incorrect. But QString is really just a wrapper around a QStringPrivate "d", and both assignment or constructor (from another QString) just do some refcounting and grab the "d" of the other QString, so it's the same. > ahmadsamir wrote in kuitmarkup.cpp:1597 > I concede that changing var names makes reading the diff harder, and I > should, if needed, have done it in a separate patch. > > My, weird, logic is that "original" is the original string that is being > modified, v.s. text which is a temporary place to store the parts of original > after processing them to prevent the match looping on one singular place in > the string. Simply original didn't translate to constant in my head. (And > "plain" didn't make sense, it's not plain when it may have "" appended > to it). > > Anyway, I'd rather change them back than argue. :) I thought it was the resolved version, without entities (if they were successfully resolved). REPOSITORY R249 KI18n BRANCH l-finalizeVisualText (branched from master) REVISION DETAIL https://phabricator.kde.org/D26285 To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns
D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
ahmadsamir added inline comments. INLINE COMMENTS > dfaure wrote in kuitmarkup.cpp:1342 > Duplication :-) > > Dangerous, if the actual char* is modified one day. Personally, I needed to see the actual regex spelled out like that to understand the rest of the code in this function, this: QLatin1String("^(") + QLatin1Strig(s_entitySubRx) + QLatin1String(");") doesn't work as well for me, and especially so with finalizeVisualText(), where ENTITY_SUBRX is many lines away, and there are two capture groups used. I thought of doing away with ENTITY_SUBRX and just construct the regex's in toVisualText() and finalizeVisualText(), but the two must use the same pattern, except for the beginning, ^ and & respectively. I can try lessening the danger by tweaking the comment. > dfaure wrote in kuitmarkup.cpp:1572 > Please try to always declare the variables where they are used. The old code > said "QString ent = ...", why does the new code look more like C, declaring > variables outside the loop? > This is in no way more efficient, construction isn't very different from > assignment. I wouldn't know about C, having never written any code with it. > This is in no way more efficient, construction isn't very different from > assignment. That is the key I was missing (which means I should do some research, before "assuming" stuff). > dfaure wrote in kuitmarkup.cpp:1597 > So it's not the "original" value anymore. Why the renaming of plain to text, > and of text to original? It makes the diff harder to review (but if you > convince me that it makes the code easier to read, that's fine). I'm just not > sure that "original" for something that is modified in many places (line 1592 > and here) makes sense as a name. I concede that changing var names makes reading the diff harder, and I should, if needed, have done it in a separate patch. My, weird, logic is that "original" is the original string that is being modified, v.s. text which is a temporary place to store the parts of original after processing them to prevent the match looping on one singular place in the string. Simply original didn't translate to constant in my head. (And "plain" didn't make sense, it's not plain when it may have "" appended to it). Anyway, I'd rather change them back than argue. :) REPOSITORY R249 KI18n REVISION DETAIL https://phabricator.kde.org/D26285 To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns
D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
ahmadsamir updated this revision to Diff 72539. ahmadsamir edited the summary of this revision. ahmadsamir added a comment. Address comments REPOSITORY R249 KI18n CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D26285?vs=72363=72539 BRANCH l-finalizeVisualText (branched from master) REVISION DETAIL https://phabricator.kde.org/D26285 AFFECTED FILES src/kuitmarkup.cpp To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns
D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
dfaure requested changes to this revision. dfaure added inline comments. This revision now requires changes to proceed. INLINE COMMENTS > kuitmarkup.cpp:1332 > > -#define ENTITY_SUBRX "[a-z]+|#[0-9]+|#x[0-9a-fA-F]+" > +const QString ENTITY_SUBRX = QStringLiteral("[a-z]+|#[0-9]+|#x[0-9a-fA-F]+"); > This creates a QString at program startup. Please change it to static const char s_entitySubRx[] = "..."; > kuitmarkup.cpp:1342 > // but do not touch & which forms an XML entity as it is. > +// Regex is: ^([a-z]+|#[0-9]+|#x[0-9a-fA-F]+); > +static const QRegularExpression restRx(QLatin1String("^(") + > ENTITY_SUBRX + QLatin1String(");")); Duplication :-) Dangerous, if the actual char* is modified one day. > kuitmarkup.cpp:1567 > if (format != Kuit::RichText) { > -static QRegExp staticEntRx(QLatin1String("&(" ENTITY_SUBRX ");")); > -QRegExp entRx = staticEntRx; // QRegExp not thread-safe > -int p = entRx.indexIn(text); > -QString plain; > -while (p >= 0) { > -QString ent = entRx.capturedTexts().at(1); > -plain.append(text.midRef(0, p)); > -text.remove(0, p + ent.length() + 2); > +// regex is: (&([a-z]+|#[0-9]+|#x[0-9a-fA-F]+);) > +static const QRegularExpression entRx(QLatin1String("(&(") + > ENTITY_SUBRX + QLatin1String(");)")); (same) > kuitmarkup.cpp:1572 > +QString text; > +QString ent; > +while (match.hasMatch()) { Please try to always declare the variables where they are used. The old code said "QString ent = ...", why does the new code look more like C, declaring variables outside the loop? This is in no way more efficient, construction isn't very different from assignment. > kuitmarkup.cpp:1597 > if (format == Kuit::RichText) { > -text = QStringLiteral("") + text + QStringLiteral(""); > +original = QLatin1String("") + original + > QLatin1String(""); > } So it's not the "original" value anymore. Why the renaming of plain to text, and of text to original? It makes the diff harder to review (but if you convince me that it makes the code easier to read, that's fine). I'm just not sure that "original" for something that is modified in many places (line 1592 and here) makes sense as a name. REPOSITORY R249 KI18n REVISION DETAIL https://phabricator.kde.org/D26285 To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns
D26285: [KuitFormatterPrivate] Start porting QRegExp to QRegularExpression
ahmadsamir created this revision. ahmadsamir added reviewers: Frameworks, ilic, dfaure, mlaurent, aacid. Herald added a project: Frameworks. ahmadsamir requested review of this revision. REVISION SUMMARY Port toVisualText() and finalizeVisualText() to use QRegularExpression. Replace ENTITY_SUBRX macro with a QString. TEST PLAN make && ctest REPOSITORY R249 KI18n BRANCH l-finalizeVisualText (branched from master) REVISION DETAIL https://phabricator.kde.org/D26285 AFFECTED FILES src/kuitmarkup.cpp To: ahmadsamir, #frameworks, ilic, dfaure, mlaurent, aacid Cc: kde-frameworks-devel, ltoscano, LeGast00n, GB_2, michaelh, ngraham, bruns