On 5 October 2014 13:26, Felix Schumacher <[email protected]> wrote:
> On 05.10.2014 at 11:30, sebb wrote:
>
>> On 4 October 2014 19:41, Philippe Mouawad <[email protected]> wrote:
>>>
>>> On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <[email protected]> wrote:
>>>
>>>> On 29.09.2014 at 22:32, Philippe Mouawad wrote:
>>>>
>>>>> Hi Felix,
>>>>>
>>>> Hi
>>>> I agree with sebb, patch is interesting.
>>>>>
>>>>> But it clearly needs to be documented (I think many users don't know about
>>>>> this feature which is really interesting) as long as code, reading patch
>>>>> first it wasn't clear for me what was intended.
>>>>>
>>>> I have added documentation to the patch and found two other things, that I
>>>> changed in the same bug-entry.
>>>>
>>>> The random order of applying the matchers, seems a bit strange, so I sorted
>>>> the matchers first by their length and if the matchers are the same length,
>>>> then by the name of their keys. So the set
>>>> {'domain': 'example.com', 'server': 'www', 'regex': 'w.*' }
>>>> would be applied in the order ['domain', 'regex', 'server'] since 'domain'
>>>> has the longest matcher and 'regex' comes before 'server' alphabetically
>>>> (matchers are both the same length).
>>>>
>>> Isn't it better to order by longest value or regexp ?
>>> www is more specific than w.*
>>> So would be :
>>> domain, server , regex
>>
>> Or the code could try to match every variable and select the one that
>> produces the longest match.
>>
>> But rather than try and sort the regexes, which is always going to be
>> tricky to do "correctly" (whatever that means), maybe the user should
>> be given control of the matching order.
>>
>> For example, it is probably possible to match by order of appearance.
>>
>> It would certainly be possible to match the variables in sorted order by name.
>> This would be a bit more awkward to use than changing the order of
>> variable definitions.
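The ordering Felix describes above (longest matcher first, ties broken by variable name) could be sketched roughly as follows. This is only an illustration in Python, not the code from the patch; the function name order_by_length_then_name is hypothetical.

```python
def order_by_length_then_name(variables):
    """Return variable names sorted by matcher length (longest first), then by name."""
    # Negating the length puts longer matchers first; the name breaks ties.
    return sorted(variables, key=lambda name: (-len(variables[name]), name))


variables = {"domain": "example.com", "server": "www", "regex": "w.*"}
print(order_by_length_then_name(variables))  # ['domain', 'regex', 'server']
```

Note that this orders by the length of the matcher text, not by how specific the matcher is, which is the objection Philippe raises with 'www' versus 'w.*'.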
>
> I just wanted to give a simple algorithm for ordering, which I think is
> better than random ordering.
>
> Correctness will be hard to implement, when everyone has a different view on
> the correct ordering.
>
> I had thought of giving more control to the user by appending the variable
> names with something to sort by.
>
> For example extending the above example with variable names ['domain',
> 'server', 'regex'] the names could be changed to ['domain_3', 'server_1',
> 'regex_2'] to impose replacement in the order ['server', 'regex', 'domain'].
> But what should we do with the suffix '_\d+'? (A prefix could be used, too)
>
> We could look for a specially named variable like '_regex_order' which could
> have a comma separated list of the variable names in the wished order.
>
> The longer I think about it, the more I am inclined to take the simple
> ordering algorithm of length and then name. One can always make any regex
> longer by adding useless junk like '(?:WILLNOTBEFOUNDANYWAY)?' and in such a
> way influence the order.
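The '_regex_order' idea mentioned above, a specially named variable holding a comma separated list of names, might look something like the sketch below. It is a hypothetical illustration in Python, not part of any patch. The function name order_from_variable is made up, and the rule that unlisted variables follow in name order is an assumption that the thread does not settle.

```python
def order_from_variable(variables, order_key="_regex_order"):
    """Return variable names in the order given by the special ordering variable."""
    wanted = [name.strip() for name in variables.get(order_key, "").split(",")]
    # Keep only names that really exist, dropping duplicates and the ordering key itself.
    listed = list(dict.fromkeys(
        name for name in wanted if name in variables and name != order_key
    ))
    # Assumption: variables that are not listed follow afterwards, sorted by name.
    rest = sorted(name for name in variables if name != order_key and name not in listed)
    return listed + rest


variables = {
    "domain": "example.com",
    "server": "www",
    "regex": "w.*",
    "_regex_order": "server, regex, domain",
}
print(order_from_variable(variables))  # ['server', 'regex', 'domain']
```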
No, length of regex is not useful.
More useful would be sorting by matched string.

Sorting by name is awkward to use, and anyway what about non-regexes
that happen to match the same text?

I don't think it's possible to automatically sort correctly by regex.
So we should allow the user to control the search order, as I already
suggested a short while ago.

> Felix
>
>>>> If no one objects, I will submit it next week.
>>>>
>>>> Regards
>>>> Felix
>>>>
>>>>> Thanks for contributing
>>>>> Regards
>>>>>
>>>>> On Monday, September 29, 2014, sebb <[email protected]> wrote:
>>>>>
>>>>>> On 29 September 2014 15:49, Felix Schumacher <[email protected]> wrote:
>>>>>>
>>>>>>> On 29 September 2014 at 12:46:19 CEST, sebb <[email protected]> wrote:
>>>>>>>
>>>>>>>> On 29 September 2014 11:24, Felix Schumacher <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On 29.09.2014 at 11:56, sebb wrote:
>>>>>>>>>
>>>>>>>>>> On 28 September 2014 18:11, Felix Schumacher <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 22.09.2014 at 11:13, Marijn Wijbenga wrote:
>>>>>>>>>>>
>>>>>>>>>>> I've attached a jmeter project file and a html file that demonstrates the
>>>>>>>>>>> issue. In order to reproduce:
>>>>>>>>>>>
>>>>>>>>>>> 1. Load up xml-bug-test.jmx in jmeter.
>>>>>>>>>>> 2. Start the proxy (recorder)
>>>>>>>>>>> 3. Place xml-bug-test.html on a webserver somewhere (if on localhost, do
>>>>>>>>>>>    not forget to remove localhost from proxy exclusion if applicable)
>>>>>>>>>>> 4. Navigate with a browser to this file (using the proxy)
>>>>>>>>>>> 5. Click both buttons in order.
>>>>>>>>>>>
>>>>>>>>>>> I could not post to a html file, hence the "test 2" button will post to
>>>>>>>>>>> Google.
>>>>>>>>>>> The page that loads has an error, but it still records the post request
>>>>>>>>>>> which is what we want to see.
>>>>>>>>>>>
>>>>>>>>>>> I also discovered that when I was using a "get" request instead (I've made
>>>>>>>>>>> that "test 1") then it doesn't match the first character (%). I think this
>>>>>>>>>>> is related.
>>>>>>>>>>>
>>>>>>>>>>> The project has a user defined variable called "TEST" with a value of ".*",
>>>>>>>>>>> I've ticked the box
>>>>>>>>>>>
>>>>>>>>>>> To see the results, in the recording controller the last two requests
>>>>>>>>>>> contain a parameter with these values:
>>>>>>>>>>>
>>>>>>>>>>> Test 1: %${TEST}
>>>>>>>>>>> Test 2: <${TEST}>
>>>>>>>>>>>
>>>>>>>>>>> Both should be just ${TEST} I believe.
>>>>>>>>>>>
>>>>>>>>>>> In the current implementation the regex will be matched against a pattern
>>>>>>>>>>> which looks like
>>>>>>>>>>>
>>>>>>>>>>> \b(YOUR_VALUE)\b
>>>>>>>>>>>
>>>>>>>>>>> As % and < are boundary characters they are excluded from your pattern.
>>>>>>>>>>
>>>>>>>>>> This is deliberate.
>>>>>>>>>> There were problems previously as partial values were being
>>>>>>>>>> unexpectedly matched.
>>>>>>>>>>
>>>>>>>>>> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52678
>>>>>>>>>>
>>>>>>>>> I thought so. Maybe, that would have been helped by adding more
>>>>>>>>> documentation, but then it is regex...
>>>>>>>>>
>>>>>>>>>>> I would consider this a bug, or at least documentation could be a bit more
>>>>>>>>>>> concise.
>>>>>>>>>>>
>>>>>>>>>> Patches welcome.
>>>>>>>>>>
>>>>>>>>> A patch was attached :)
>>>>>>>>>
>>>>>>>> I meant that we would welcome a patch for the documentation.
>>>>>>>> Or at least some indication of where the documentation needs to be
>>>>>>>> updated to clarify the current behaviour.
>>>>>>>>
>>>>>>> I will look into that.
>>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> What is your opinion on the option to detect parens and modify the regex
>>>>>> behavior?
>>>>>>
>>>>>> Looks good to me.
>>>>>>
>>>>>> The parens are very unlikely to have been used in existing tests, so
>>>>>> the modified behaviour is unlikely to break anything.
>>>>>> But we should document it in the release notes just in case.
>>>>>>
>>>>>> Felix
>>>>>>
>>>>>>>>>>> Attached is a patch against trunk, which checks the regex if it starts with
>>>>>>>>>>> '(' and ends with ')' and uses the regex as given, instead of building its
>>>>>>>>>>> own version.
>>>>>>>>>>>
>>>>>>>>>> Please use Bugzilla for patches; it's easier to keep track of them.
>>>>>>>>>>
>>>>>>>>> I have already done so yesterday shortly after sending my mail. It is
>>>>>>>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=57032
>>>>>>>>>
>>>>>>>>> What is missing from the patch is documentation. If the feature as such is
>>>>>>>>> ok, then I would add that to the existing documentation.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Felix
>>>>>>>>>
>>>>>>>>>>> Also, see notes below.
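To make the behaviour discussed above concrete, here is a small sketch of the \b(YOUR_VALUE)\b wrapping and of the proposed check for a value that starts with '(' and ends with ')'. It uses Python's re module purely for illustration. JMeter's recorder is written in Java and uses its own regex engine, so details may differ, and the helper names build_pattern and replace_first are hypothetical.

```python
import re


def build_pattern(value):
    """Use the value as given if it is wrapped in parentheses, else add word boundaries."""
    if value.startswith("(") and value.endswith(")"):
        return re.compile(value)
    return re.compile(r"\b(" + value + r")\b")


def replace_first(value, name, text):
    """Replace the first match of the variable's pattern with ${name}."""
    # A function is used as the replacement so that no escape processing is applied.
    return build_pattern(value).sub(lambda match: "${" + name + "}", text, count=1)


print(replace_first(".*", "TEST", "<abc>"))    # <${TEST}>
print(replace_first(".*", "TEST", "%abc"))     # %${TEST}
print(replace_first("(.*)", "TEST", "<abc>"))  # ${TEST}
```

With the boundaries added, the match can only start and end next to a word character, so the leading '%' or the outer '<' and '>' stay outside the replacement. With the parenthesised form the whole value is matched.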
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: sebb [mailto:[email protected]]
>>>>>>>>>>> Sent: 21 September 2014 01:52
>>>>>>>>>>> To: JMeter Users List
>>>>>>>>>>> Subject: Re: Test Script Recorder XML Regex Matching
>>>>>>>>>>>
>>>>>>>>>>> On 19 September 2014 16:45, Marijn Wijbenga <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I have an issue, which might well be a potential bug, where a posted value
>>>>>>>>>>> is not being matched by the Test Script Recorder's Regex Matching
>>>>>>>>>>> functionality.
>>>>>>>>>>>
>>>>>>>>>>> The request I'm recording has a post value containing XML (SAML token to be
>>>>>>>>>>> exact) which I'd like to replace with a variable automatically.
>>>>>>>>>>>
>>>>>>>>>>> What does the value look like?
>>>>>>>>>>> Does it have multiple lines?
>>>>>>>>>>>
>>>>>>>>>>> No, it did not have multiple lines. I did check if this was the case, but
>>>>>>>>>>> it wasn't
>>>>>>>>>>>
>>>>>>>>>>> For testing purposes I have configured a User Defined Variable (called
>>>>>>>>>>> TEST) with a value of "(?s)^.*$", I've tried "^.*$" and ".*" as well (all
>>>>>>>>>>> without double quotes).
>>>>>>>>>>>
>>>>>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>>>>>
>>>>>>>>>>> That does not make sense.
>>>>>>>>>>> ".*" will match everything, including < and >, so the content would
>>>>>>>>>>> become ${TEST}
>>>>>>>>>>>
>>>>>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>>>>>
>>>>>>>>>>> I've tried other expressions as well and I'm able to match anything within
>>>>>>>>>>> the <> characters, but not those characters themselves.
>>>>>>>>>>>
>>>>>>>>>>> Again, that does not make sense.
>>>>>>>>>>>
>>>>>>>>>>> The weird thing is, that inside the outer <> characters there are other <>
>>>>>>>>>>> characters that are matched fine. It's just the first and last character.
>>>>>>>>>>>
>>>>>>>>>>> Has anyone else experienced the same thing, or is this a known issue?
>>>>>>>>>>>
>>>>>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>>>>>
>>>>>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>>>>>
>>>>>>>>>>> No, the developers all follow this list.
>>>>>>>>>>>
>>>>>>>>>>> Great, please see attachment for an example.
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
