Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Fri, 09 Jan 2015 15:36:21 +0000 FrankLike via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:

> import std.regex;
> auto ctr = ctRegex!(`(home|office|sea|plane)`);
> auto c2 = !matchFirst("He is in the sea.", ctr).empty;
>
> Test by: auto r = benchmark!(f0, f1, f2, f3, f4, f5)(10_);
> Result is:
>   filter             42ms 85us
>   findAmong          37ms 268us
>   foreach indexOf    37ms 841us
>   canFind            13ms
>   canFind indexOf    39ms 455us
>   ctRegex            138ms

1. stop doing captures in the regexp; this will speed up the comparison.
2. your sample is very artificial. i was talking about a lot more keywords and a lot longer strings. sorry, i didn't make that clear enough.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Friday, 9 January 2015 at 15:57:21 UTC, ketmar via Digitalmars-d-learn wrote:
> 1. stop doing captures in the regexp; this will speed up the comparison.
> 2. your sample is very artificial. i was talking about a lot more keywords and a lot longer strings. sorry, i didn't make that clear enough.

Yes. Regex will do better with 'a lot more keywords and a lot longer strings'. Thank you.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Friday, 9 January 2015 at 14:03:21 UTC, ketmar via Digitalmars-d-learn wrote:
> std.regex can use CTFE to compile regular expressions (yet it is sometimes slower than the non-CTFE variant), and i mean that we compile the regexp before doing a lot of searches, not before each single search. if you have a lot of words to match or a lot of strings to check, regexp can give a huge boost.
>
> sure, it all depends on code patterns.

    import std.regex;
    auto ctr = ctRegex!(`(home|office|sea|plane)`);
    auto c2 = !matchFirst("He is in the sea.", ctr).empty;

Test by: auto r = benchmark!(f0, f1, f2, f3, f4, f5)(10_);

Result is:
  filter             42ms 85us
  findAmong          37ms 268us
  foreach indexOf    37ms 841us
  canFind            13ms
  canFind indexOf    39ms 455us
  ctRegex            138ms
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Friday, 9 January 2015 at 07:41:07 UTC, ketmar via Digitalmars-d-learn wrote:
> yet you can do it even easier:
>
>     if ([".exe", ".lib", ".a", ".dll"].canFind(fname.extension)) {
>
> as you are obviously interested in the extension here -- check only that part! ;-)

Sorry, it was only an example. Thank you for the hard work, but it's not what I want.
The 'indexOfAny' function should do this work: given "he is at home" and ["home", "office", "sea", "plane"], IndexOfAny can do it in C#. What about in D?
I know findAmong can do it, but it takes two functions.

Thank you.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Fri, 09 Jan 2015 09:36:01 +0000 FrankLike via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:

> Sorry, it was only an example. Thank you for the hard work, but it's not what I want.
> The 'indexOfAny' function should do this work: given "he is at home" and ["home", "office", "sea", "plane"], IndexOfAny can do it in C#. What about in D?
> I know findAmong can do it, but it takes two functions.

be creative! ;-)

    import std.algorithm, std.stdio;

    void main () {
      string s = "he is at plane";
      if (findAmong!((string a, string b) => b.canFind(a))([s], ["home", "office", "sea", "plane"]).length) {
        writeln("got it!");
      } else {
        writeln("alas...");
      }
    }

or:

    import std.algorithm, std.stdio;

    void main () {
      string s = "he is at home";
      if (["home", "office", "sea", "plane"].canFind!((a, string b) => b.canFind(a))(s)) {
        writeln("got it!");
      } else {
        writeln("alas...");
      }
    }
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Friday, 9 January 2015 at 14:03:21 UTC, ketmar via Digitalmars-d-learn wrote:
> std.regex can use CTFE to compile regular expressions (yet it is sometimes slower than the non-CTFE variant), and i mean that we compile the regexp before doing a lot of searches, not before each single search. if you have a lot of words to match or a lot of strings to check, regexp can give a huge boost.
>
> sure, it all depends on code patterns.

Even with CTFE, regex still uses a state machine; _mm256_cmpeq_epi8 will beat that even for multiple strings. Basically all lexers are handwritten. If regex were fast enough, nobody would do that work.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Fri, 09 Jan 2015 14:11:49 +0000 Robert burner Schadek via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:

> Even with CTFE, regex still uses a state machine; _mm256_cmpeq_epi8 will beat that even for multiple strings. Basically all lexers are handwritten. If regex were fast enough, nobody would do that work.

heh. regexps *are* fast enough. it's hard to beat a well-optimised generated thingy on a complex grammar. ;-)
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Friday, 9 January 2015 at 13:25:17 UTC, ketmar via Digitalmars-d-learn wrote:
> if you are *really* concerned with speed here, you'd better consider using regular expressions, as a regular expression can be precompiled and then search for multiple words with only one pass over the source string. i believe that std.regex will use a variation of the Thompson algorithm for regular expressions when it is able to do so.

IMO that is not sound advice. Creating the state machine and running it will be more costly than using canFind or indexOf, which basically only compare char by char. If speed is really needed, use strstr and check whether it uses SSE to compare multiple chars at a time.

Anyway, benchmark, and then benchmark some more.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Fri, 09 Jan 2015 13:54:00 +0000 Robert burner Schadek via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:

> IMO that is not sound advice. Creating the state machine and running it will be more costly than using canFind or indexOf, which basically only compare char by char. If speed is really needed, use strstr and check whether it uses SSE to compare multiple chars at a time.
>
> Anyway, benchmark, and then benchmark some more.

std.regex can use CTFE to compile regular expressions (yet it is sometimes slower than the non-CTFE variant), and i mean that we compile the regexp before doing a lot of searches, not before each single search. if you have a lot of words to match or a lot of strings to check, regexp can give a huge boost.

sure, it all depends on code patterns.
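A minimal sketch of the "compile once, search many times" idea from this message, assuming a current Phobos `std.regex`. The helper name `mentionsAny` and the sample sentences are made up for illustration, and the pattern deliberately has no capture group because only a yes/no answer is needed. It makes no claim about speed; that still has to be measured.

```d
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: true when `text` contains any alternative of `re`.
bool mentionsAny(string text, Regex!char re)
{
    return !matchFirst(text, re).empty;
}

void main()
{
    // Compile the pattern once, outside the loop, and reuse it for every line.
    auto places = regex(`home|office|sea|plane`);

    foreach (line; ["He is in the sea.", "She is at the office.", "Nobody is here."])
        writeln(line, " -> ", mentionsAny(line, places));
}
```

`ctRegex!(...)` could be swapped in for `regex(...)` to build the matcher at compile time; the helper would then need to be templated on the regex type.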
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
> be creative! ;-)
>
>     import std.algorithm, std.stdio;
>
>     void main () {
>       string s = "he is at plane";
>       if (findAmong!((string a, string b) => b.canFind(a))([s], ["home", "office", "sea", "plane"]).length) {
>         writeln("got it!");
>       } else {
>         writeln("alas...");
>       }
>     }
>
> or:
>
>     import std.algorithm, std.stdio;
>
>     void main () {
>       string s = "he is at home";
>       if (["home", "office", "sea", "plane"].canFind!((a, string b) => b.canFind(a))(s)) {
>         writeln("got it!");
>       } else {
>         writeln("alas...");
>       }
>     }

This code is the best, and it's better than IndexOfAny in C#:

    import std.algorithm, std.stdio;

    void main ()
    {
        auto places = ["home", "office", "sea", "plane"];
        auto strWhere = "He is in the sea.";
        auto where = places.canFind!(a => strWhere.canFind(a));
        writeln("Result is ", where);
    }
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Fri, 09 Jan 2015 12:46:53 +0000 FrankLike via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:

> This code is the best, and it's better than IndexOfAny in C#:
>
>     import std.algorithm, std.stdio;
>
>     void main ()
>     {
>         auto places = ["home", "office", "sea", "plane"];
>         auto strWhere = "He is in the sea.";
>         auto where = places.canFind!(a => strWhere.canFind(a));
>         writeln("Result is ", where);
>     }

this does unnecessary upvalue access (`strWhere`). try to avoid such stuff whenever possible.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Friday, 9 January 2015 at 10:02:53 UTC, ketmar via Digitalmars-d-learn wrote:
>     import std.algorithm, std.stdio;
>
>     void main () {
>       string s = "he is at home";
>       if (["home", "office", "sea", "plane"].canFind!((a, string b) => b.canFind(a))(s)) {
>         writeln("got it!");
>       } else {
>         writeln("alas...");
>       }
>     }

Thank you.

This code is the best, and it's better than IndexOfAny in C#:

    /* places.canFind!(a => strWhere.canFind(a)); */

Measured with: auto r = benchmark!(f0, f1, f2, f3, f4)(10_);

Result is:
  filter             42ms 85us
  findAmong          37ms 268us
  foreach indexOf    37ms 841us
  canFind            13ms
  canFind indexOf    39ms 455us

--- the 5 functions ---

    import std.stdio, std.algorithm, std.string;

    auto places = ["home", "office", "sea", "plane"];
    auto strWhere = "He is in the sea.";

    void main()
    {
        // 0: filter + indexOf
        auto where = places.filter!(a => strWhere.indexOf(a) != -1);
        writeln("0 Result is ", where);

        // 1: findAmong
        auto where1 = findAmong(places, strWhere);
        writeln("1 Result is ", where1);

        // 2: foreach + indexOf
        string where2;
        foreach (a; places)
        {
            if (strWhere.indexOf(a) != -1)
            {
                where2 = a;
                break;
            }
        }
        writeln("2 Result is ", where2);

        // 3: canFind + canFind
        auto where3 = places.canFind!(a => strWhere.canFind(a));
        writeln("3 Result is ", where3);

        // 4: canFind + indexOf
        auto where4 = places.canFind!(a => strWhere.indexOf(a) != -1);
        writeln("4 Result is ", where4);
    }

Frank
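For anyone who wants to repeat this kind of measurement, here is a minimal sketch, assuming a current Phobos where `benchmark` lives in `std.datetime.stopwatch`. The function names `viaCanFind` and `viaIndexOf` and the iteration count are illustrative choices, not the ones used in the message above; only two of the variants are shown.

```d
import std.algorithm.searching : canFind;
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;
import std.string : indexOf;

string[] places = ["home", "office", "sea", "plane"];
string strWhere = "He is in the sea.";

// Variant "canFind": substring test done with canFind.
bool viaCanFind()
{
    return places.canFind!(a => strWhere.canFind(a));
}

// Variant "canFind indexOf": substring test done with indexOf.
bool viaIndexOf()
{
    return places.canFind!(a => strWhere.indexOf(a) != -1);
}

void main()
{
    enum runs = 100_000; // assumed iteration count, pick your own
    auto r = benchmark!(viaCanFind, viaIndexOf)(runs);
    writefln("canFind:         %s", r[0]);
    writefln("canFind indexOf: %s", r[1]);
}
```

Timings depend heavily on compiler, flags and input, so the numbers are only meaningful when compared on the same machine and build.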
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Fri, 09 Jan 2015 13:06:09 +0000 FrankLike via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:

> Measured with: auto r = benchmark!(f0, f1, f2, f3, f4)(10_);
>
> Result is:
>   filter             42ms 85us
>   findAmong          37ms 268us
>   foreach indexOf    37ms 841us
>   canFind            13ms
>   canFind indexOf    39ms 455us

if you are *really* concerned with speed here, you'd better consider using regular expressions, as a regular expression can be precompiled and then search for multiple words with only one pass over the source string. i believe that std.regex will use a variation of the Thompson algorithm for regular expressions when it is able to do so.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Friday, 9 January 2015 at 14:21:04 UTC, ketmar via Digitalmars-d-learn wrote:
> heh. regexps *are* fast enough. it's hard to beat a well-optimised generated thingy on a complex grammar. ;-)

I don't see your point. Anyway, I think he got his help, or at least some help.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Thursday, 8 January 2015 at 15:15:59 UTC, Robert burner Schadek wrote:
> use canFind like such:
>
>     bool a = canFind(strs, s) >= 1;
>
> let the compiler figure out what the types of the parameters are.

canFind works for cases such as:

    bool x = canFind(["exe", "lib", "a", "dll"], "a");

but it can't work for:

    canFind(["exe", "lib", "a", "dll"], "hello.lib");

So I very much want the function 'indexOfAny' to do this work.

Thank you.
Frank
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Fri, 09 Jan 2015 07:10:14 +0000 FrankLike via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:

> canFind works for cases such as:
>
>     bool x = canFind(["exe", "lib", "a", "dll"], "a");
>
> but it can't work for:
>
>     canFind(["exe", "lib", "a", "dll"], "hello.lib");
>
> So I very much want the function 'indexOfAny' to do this work.

be creative! ;-)

    import std.algorithm, std.stdio;

    void main () {
      string fname = "hello.exe";
      import std.path : extension;
      if (findAmong([fname.extension], [".exe", ".lib", ".a", ".dll"]).length) {
        writeln("got it!");
      } else {
        writeln("alas...");
      }
    }

note the dots in the extension list.

yet you can do it even easier:

    import std.algorithm, std.stdio;

    void main () {
      string fname = "hello.exe";
      import std.path : extension;
      if ([".exe", ".lib", ".a", ".dll"].canFind(fname.extension)) {
        writeln("got it!");
      } else {
        writeln("alas...");
      }
    }

as you are obviously interested in the extension here -- check only that part! ;-)
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Wednesday, 7 January 2015 at 17:08:55 UTC, H. S. Teoh via Digitalmars-d-learn wrote:
> Try this:
>
> http://dlang.org/phobos-prerelease/std_algorithm#.findAmong
>
> T

You mean this? The result is not what I want to get!

--- test.d ---

    import std.stdio, std.algorithm, std.string;

    auto ext = ["exe", "lib", "a", "dll"];
    auto strs = "hello.exe";

    void main()
    {
        auto b = findAmong(ext, strs);
        writeln("b is ", b);
    }

--- result ---

    b is ["exe", "lib", "a", "dll"]

Note:
1. I only want to find out whether the given string 'hello.exe' includes any of the strings in ["exe", "lib", "a", "dll"].
2. I think the 'indexOfAny' function in string.d does the same work as 'indexOf'. This is not as it should be.

Frank
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Wednesday, 7 January 2015 at 17:08:55 UTC, H. S. Teoh via Digitalmars-d-learn wrote:
> Try this:
>
> http://dlang.org/phobos-prerelease/std_algorithm#.findAmong
>
> T

Thank you, it can work, but it's not what I want.

--- test.d ---

    import std.stdio, std.algorithm, std.string;

    auto ext = ["exe", "lib", "a", "dll"];
    auto strs = "hello.dll";

    void main()
    {
        auto b = findAmong(ext, strs);
        writeln("b is ", b);
    }

--- result ---

    b is ["dll"]

I think it would be fine if the 'indexOfAny' function in string.d did this work, such as:

    auto b = "hello.dll".indexOfAny(["exe", "lib", "a", "dll"]);
    writeln("b is ", b);

The result should be 'true', if it worked.

Can you suggest that Phobos update the 'indexOfAny' function?

Thank you.
Frank
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
use canFind like such:

    bool a = canFind(strs, s) >= 1;

let the compiler figure out what the types of the parameters are.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
FrankLike:

> But now I want to know whether a string (like "hello.exe", "hello.a", "hello.dll" or "hello.lib") contains any of them: ["exe", "dll", "a", "lib"].

Seems this:
http://rosettacode.org/wiki/File_extension_is_in_extensions_list#D

Bye,
bearophile
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Wednesday, 7 January 2015 at 15:11:57 UTC, John Colvin wrote:
> std.algorithm.canFind will do what you want, including telling you which of ["exe", "lib", "dll", "a"] was found.
>
> If you need to know where in strs it was found as well, you can use std.algorithm.find

Sorry, 'std.algorithm.find' does this work: "Finds an individual element in an input range", and its parameters are:

    InputRange haystack   The range searched in.
    Element needle        The element searched for.

But now I want to know whether a string (like "hello.exe", "hello.a", "hello.dll" or "hello.lib") contains any of them: ["exe", "dll", "a", "lib"].

My function 'findStr' works fine. If string.d's function 'indexOfAny' did this work, I would be happy (but right now 'indexOfAny' and 'indexOf' do the same work).

Thank you.
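As a side note on the 'indexOfAny' versus 'indexOf' question, here is a small sketch, assuming a current Phobos `std.string`: `indexOf` looks for a whole substring, while `indexOfAny` treats its second argument as a set of individual characters. The helper `indexOfAnyWord` is hypothetical (it is not part of Phobos) and only illustrates the "any of these substrings" behaviour asked for in this thread.

```d
import std.stdio : writeln;
import std.string : indexOf, indexOfAny;

// Hypothetical helper (not in Phobos): index of the earliest occurrence of
// any of the given substrings, or -1 when none of them occurs.
ptrdiff_t indexOfAnyWord(string haystack, string[] needles)
{
    ptrdiff_t best = -1;
    foreach (needle; needles)
    {
        immutable pos = haystack.indexOf(needle);
        if (pos != -1 && (best == -1 || pos < best))
            best = pos;
    }
    return best;
}

void main()
{
    // indexOf searches for the whole substring "exe".
    writeln("hello.exe".indexOf("exe"));
    // indexOfAny searches for any single character out of "xyz".
    writeln("hello.exe".indexOfAny("xyz"));
    // The helper searches for any whole word from the list.
    writeln(indexOfAnyWord("he is at home", ["home", "office", "sea", "plane"]));
}
```

The helper makes one pass per needle, which is fine for a handful of short words; for long word lists a different approach may be worth measuring.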
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Wednesday, 7 January 2015 at 14:54:51 UTC, FrankLike wrote:
> I want to know whether the string strs contains 'exe', 'dll', 'a' or 'lib'. In C#, I can do:
>
>     int index = indexofany(strs, ["exe", "dll", "a", "lib"]);
>
> but in D I must do it like this:
>
>     findStr(strs, ["exe", "lib", "dll", "a"]);
>
>     bool findStr(string strIn, string[] strFind)
>     {
>         bool bFind = false;
>         foreach (str; strFind)
>         {
>             if (strIn.indexOf(str) != -1)
>             {
>                 bFind = true;
>                 break;
>             }
>         }
>         return bFind;
>     }
>
> Can a function like this be added to Phobos's string.d, to make indexOfAny better?
>
> Thank you.
> Frank

std.algorithm.canFind will do what you want, including telling you which of ["exe", "lib", "dll", "a"] was found.

If you need to know where in strs it was found as well, you can use std.algorithm.find
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
> std.algorithm.find has several overloads, one of which takes multiple needles. The same is true for std.algorithm.canFind
>
> Quoting from the relevant std.algorithm.find overload docs: "Finds two or more needles into a haystack."

    string strs = "hello.exe";
    string[] s = ["lib", "exe", "a", "dll"];
    auto a = canFind!(string, string[])(strs, s);
    writeln("a is ", a);

    string strsb = "hello.";
    auto b = canFind!(string, string[])(strsb, s);
    writeln("b is ", b);

I get the error:

    does not match template declaration canFind(alias pred = "a == b")

You can test it.
Thank you.
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
Try this:

http://dlang.org/phobos-prerelease/std_algorithm#.findAmong

T

-- 
MACINTOSH: Most Applications Crash, If Not, The Operating System Hangs
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Wednesday, 7 January 2015 at 15:57:18 UTC, FrankLike wrote:
> Sorry, 'std.algorithm.find' does this work: "Finds an individual element in an input range", and its parameters are:
>
>     InputRange haystack   The range searched in.
>     Element needle        The element searched for.

std.algorithm.find has several overloads, one of which takes multiple needles. The same is true for std.algorithm.canFind

Quoting from the relevant std.algorithm.find overload docs: "Finds two or more needles into a haystack."
Re: Why do the same work about 'IndexOfAny' and 'indexOf' function?
On Wednesday, 7 January 2015 at 16:02:25 UTC, bearophile wrote:
> Seems this:
> http://rosettacode.org/wiki/File_extension_is_in_extensions_list#D

Which uses this overload:

    size_t canFind(Range, Ranges...)(Range haystack, Ranges needles)
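A minimal sketch of the multiple-needle search mentioned here, assuming a current Phobos. It uses `std.algorithm.searching.find`, whose multi-needle overload returns a tuple of the remaining haystack and the 1-based index of the needle that matched (0 when none did). The helper name `matchedNeedle` is made up for illustration.

```d
import std.algorithm.searching : find;
import std.stdio : writeln;

// Hypothetical helper: 1-based position (within the needle list) of the
// first needle found in `name`, or 0 when none of them occurs.
size_t matchedNeedle(string name)
{
    // Needles are passed as separate arguments, not as one array.
    return find(name, "exe", "dll", "a", "lib")[1];
}

void main()
{
    foreach (name; ["hello.exe", "hello.dll", "hello.a", "hello.lib", "hello."])
        writeln(name, " -> ", matchedNeedle(name));
}
```

Note that this is a plain substring search: a name such as "salad.txt" would also match the needle "a". To test only the file extension, compare against `std.path.extension` as shown elsewhere in this thread.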