bearophile wrote: > You are right, sorry. No need to apologise! You helped me significantly improve my code, helped me understand D a lot better and left me feeling generally very positive about developing further in D. I'd call that a result. :-)
>> So I think it's probably just compiler difference that's to blame for speed >> differences< > > This can be true, but such differences are not magic, they can be found, and > eventually they can even become precise enhancement requests for llvm > developers, they are open to such requests. I hope my questions here have been useful in this respect ... >> After all, D in general and DMD in particular is in development.< > > DMD has a quite old back-end that is not in active development. Maybe it will > become 64 bit. The DMD backend seems a generally problematic piece of code given the proprietary restrictions that apply to it. I have the impression that it's being used simply because it's the backend that is known and tested and that over the longer term there may be a switch ... ? The one compiler-related concern I have is whether there will be an effective free-as-in-freedom compiler for D2 in the near future. Work is going on GDC but it's moving slowly -- and I don't see a clear LDC roadmap to D2 support. > In this program I have seen that a good percentage of the speed difference > between ldc/dmd is caused by foreach loops. When you iterate over structs > longer than size_t use ref (or in D2 better ref const(...)). > foreach (Rating r; ratings) > ==> > foreach (ref const(Rating) r; ratings) > > There is nothing algorithmically strange going on here, but to be sure you > can take a look at the produced asm. Interesting ... ! I'm afraid the asm is over my head, but I will read up and try to understand it. > To find why the C++ code is faster you can show me the equivalent C++ > code that I can compile myself, or you can compile your C++ code and > show me the asm of the two functions that are hot spots. Sure -- attached. The key functions are the ones in tref.cpp (the smalltref class) and trefsim.cpp, and the treftest.cpp file that runs a roughly-equivalent simulation to test.d. It's a little embarrassing to share because it's fairly badly-designed code -- my first serious C++ attempt -- it served its purpose some time ago and was done with. It started to become vaguely relevant again so I thought I'd try and redo it with a better design, and rather than do it in C++ (which I could do much better by now) this seemed like a nice excuse to build my first D project ... :-) I already made a couple of tweaks based on your improvements to the D code, but I think g++ probably already has such optimizations during the compile process. I also realised that I wasn't comparing like with like -- the code was using the RanLux random number generator, and with mt19937 it shaves quite a bit of time off. So C++ still has a bit of an edge right now. There's surely a lot of other stuff that can be done to improve this C++, but please don't put lots of effort in unless it helps with improving D (the language and/or frontend, not my code, which is not so important:-). Thanks & best wishes, -- Joe
#ifndef __TREF_HPP__ #define __TREF_HPP__ #include <vector> typedef long unsigned int objectID; typedef long unsigned int userID; namespace tref { class rating { public: objectID o; userID u; double value; rating(objectID _o, userID _u, double _value); }; // class __tref // Base class for all iterative refinement algorithms. // Contains only minimally needed info and functions. class __tref { protected: bool verbose; std::vector<tref::rating> ratings; std::vector<double> quality; std::vector<double> weight; double beta; double epsilon; double convergence; virtual void addLink(objectID o, userID u); virtual void iterationInit(); virtual double qualityUpdate(); virtual void weightUpdate(); public: __tref(long unsigned int _objects, long unsigned int _users, long unsigned int _links, double _beta, double _epsilon, double _convergence, bool _verbose); void addRating(objectID o, userID u, double value); unsigned int iterationCycle(); long unsigned int objects(); long unsigned int users(); long unsigned int links(); double objectQuality(objectID o); double userWeight(userID u); void printRatings(); }; // class smalltref : public __tref // Memory-efficient but slow algorithm. class smalltref : public __tref { protected: std::vector<double> qualityLast; std::vector<double> weightSum; std::vector<unsigned long int> userLinks; virtual void addLink(objectID o, userID u); virtual double qualityUpdate(); virtual void weightUpdate(); public: smalltref(long unsigned int _objects, long unsigned int _users, long unsigned int _links, double _beta, double _epsilon, double _convergence, bool _verbose); }; // class bigtref : public __tref // Supposedly faster algorithm, but needs more memory. // Extra allocation required may actually make it slower // in many cases. class bigtref : public __tref { protected: std::vector< std::vector<tref::rating *> > objectRatings; std::vector< std::vector<tref::rating *> > userRatings; std::vector<unsigned long int> objectLinks; std::vector<unsigned long int> userLinks; virtual void iterationInit(); virtual double qualityUpdate(); virtual void weightUpdate(); public: bigtref(long unsigned int _objects, long unsigned int _users, long unsigned int _links, double _beta, double _epsilon, double _convergence, bool _verbose); }; } #endif /* __TREF_HPP__ */
#include <cmath> #include <iostream> #include <vector> #include "tref.hpp" namespace tref { // class rating // public: rating::rating(objectID _o, userID _u, double _value) { o = _o; u = _u; value = _value; } // class __tref // protected: void __tref::addLink(objectID o, userID u) {} void __tref::iterationInit() {} double __tref::qualityUpdate() { return 0; } void __tref::weightUpdate() {} // class __tref // public: __tref::__tref(long unsigned int _objects, long unsigned int _users, long unsigned int _links, double _beta, double _epsilon, double _convergence, bool _verbose) { verbose = _verbose; ratings.reserve(_links); if(verbose) { std::cout << "Reserved space for " << ratings.capacity() << " links." << std::endl; std::cout << "Maximum possible: " << ratings.max_size() << std::endl; } quality.resize(_objects,0); weight.resize(_users,1); beta = _beta; epsilon = _epsilon; convergence = _convergence; } void __tref::addRating(objectID o, userID u, double value) { ratings.push_back(tref::rating(o,u,value)); if(o>=quality.size()) quality.resize(o+1,0); if(u>=weight.size()) weight.resize(u+1,1); addLink(o,u); } unsigned int __tref::iterationCycle() { iterationInit(); if(verbose) { std::cout << "Performing iteration cycle with " << quality.size() << " objects, "; std::cout << weight.size() << " users" << std::endl; std::cout << "and " << ratings.size() << " links, "; std::cout << "beta = " << beta << ", epsilon = " << epsilon << std::endl; std::cout << "and convergence condition = " << convergence << std::endl; } unsigned int iter = 0; double qdiff = qualityUpdate(); do { ++iter; weightUpdate(); qdiff = qualityUpdate(); } while(qdiff > convergence); if(verbose) std::cout << "Exited in " << iter << " iterations with qdiff = " << qdiff << std::endl; return iter; } long unsigned int __tref::objects() { return quality.size(); } long unsigned int __tref::users() { return weight.size(); } long unsigned int __tref::links() { return ratings.size(); } double __tref::objectQuality(objectID o) { return quality[o]; } double __tref::userWeight(userID u) { return weight[u]; } void __tref::printRatings() { for(std::vector<tref::rating>::iterator r=ratings.begin();r!=ratings.end();++r) std::cout << r->o << "\t" << r->u << "\t" << r->value << std::endl; } // class smalltref : public __tref // protected: void smalltref::addLink(objectID o, userID u) { if(o>=qualityLast.size()) { qualityLast.resize(o+1); weightSum.resize(o+1); } if(u>=userLinks.size()) userLinks.resize(u+1,0); ++(userLinks[u]); } double smalltref::qualityUpdate() { double qdiff = 0; qualityLast.assign(quality.begin(),quality.end()); quality.assign(quality.size(),0); weightSum.assign(weightSum.size(),0); for(std::vector<tref::rating>::iterator r=ratings.begin();r!=ratings.end();++r) { quality[r->o] += weight[r->u] * r->value; weightSum[r->o] += weight[r->u]; } for(objectID o=0;o!=quality.size();++o) { if(weightSum[o]>0) quality[o] /= weightSum[o]; double aux = quality[o]-qualityLast[o]; qdiff += aux*aux; } return sqrt(qdiff); } void smalltref::weightUpdate() { weight.assign(weight.size(),0); for(std::vector<tref::rating>::iterator r=ratings.begin();r!=ratings.end();++r) { double aux = r->value - quality[r->o]; weight[r->u] += aux*aux; } for(userID u=0;u!=weight.size();++u) { if(userLinks[u]!=0) weight[u] = pow( ((weight[u]/userLinks[u]) + epsilon), -beta); } } // class smalltref : public __tref // public: smalltref::smalltref(long unsigned int _objects, long unsigned int _users, long unsigned int _links, double _beta, double _epsilon, double _convergence, bool _verbose) : __tref(_objects,_users,_links, _beta,_epsilon,_convergence, _verbose) { qualityLast.resize(_objects); weightSum.resize(_objects); userLinks.resize(_users,0); } // class bigtref : public __tref // protected: void bigtref::iterationInit() { for(objectID o=0;o!=objectRatings.size();++o) objectRatings[o].reserve(objectLinks[o]); for(userID u=0;u!=userRatings.size();++u) userRatings[u].reserve(userLinks[u]); for(std::vector<tref::rating>::iterator r=ratings.begin();r!=ratings.end();++r) { objectRatings[r->o].push_back(&(*r)); userRatings[r->u].push_back(&(*r)); } } double bigtref::qualityUpdate() { double qdiff = 0; for(objectID o=0;o!=objectRatings.size();++o) { double qold = quality[o]; double weightSum = quality[o] = 0; for(std::vector<tref::rating *>::iterator r=objectRatings[o].begin();r!=objectRatings[o].end();++r) { quality[o] += weight[(*r)->u] * (*r)->value; weightSum += weight[(*r)->u]; } if(weightSum>0) quality[o] /= weightSum; qdiff += pow((quality[o]-qold),2); } return sqrt(qdiff); } void bigtref::weightUpdate() { for(userID u=0;u!=userRatings.size();++u) { if(userRatings[u].size()>0) { weight[u] = 0; for(std::vector<tref::rating *>::iterator r=userRatings[u].begin();r!=userRatings[u].end();++r) weight[u] += pow( ((*r)->value - quality[(*r)->o]), 2); if(weight[u]==0) weight[u] = epsilon; weight[u] = pow( (weight[u]/userRatings[u].size()), -beta); } } } // class bigtref : public __tref // public: bigtref::bigtref(long unsigned int _objects, long unsigned int _users, long unsigned int _links, double _beta, double _epsilon, double _convergence, bool _verbose) : __tref(_objects,_users,_links, _beta,_epsilon,_convergence, _verbose) { objectRatings.resize(_objects); userRatings.resize(_users); objectLinks.resize(_objects,0); userLinks.resize(_users,0); } }
#ifndef __TREFSIM_HPP__ #define __TREFSIM_HPP__ #include <cmath> #include <set> #include <utility> #include <gsl/gsl_randist.h> #include <gsl/gsl_rng.h> #include <gsl/gsl_sort.h> #include <gsl/gsl_statistics.h> #include "tref.hpp" namespace tref { template <class TREF> class sim { protected: TREF *tref; gsl_rng *r; bool verbose; std::vector<double> objectQuality; std::vector<double> userAbility; unsigned long int objects; unsigned long int users; double links; double linksMax; double qualityScale; double abilityScale; double ratingScale; long unsigned int iter; virtual void generateObjectQuality() { for(std::vector<double>::iterator Q=objectQuality.begin();Q!=objectQuality.end();++Q) *Q = qualityScale * gsl_rng_uniform_pos(r); } virtual void generateUserAbility() { for(std::vector<double>::iterator sigma2=userAbility.begin();sigma2!=userAbility.end();++sigma2) *sigma2 = abilityScale * gsl_rng_uniform_pos(r); } virtual double rateObject(objectID o, userID u) { return ( ( ratingScale * userAbility[u] * ((2*gsl_rng_uniform(r))-1) ) + objectQuality[o] ); } void generateRatings() { if(links!=linksMax) { std::vector<long unsigned int> objectLinks(objects,0); std::vector<long unsigned int> userLinks(users,0); std::set< std::pair<objectID,userID> > L; unsigned long int unlinkedObjects = 0, unlinkedUsers = 0; for(long unsigned int l=0;l!=links;++l) { std::pair<std::set< std::pair<objectID,userID> >::iterator,bool> ret; objectID o; userID u; do { o = gsl_rng_uniform_int(r,objects); u = gsl_rng_uniform_int(r,users); ret = L.insert(std::pair<objectID,userID>(o,u)); } while(!(ret.second)); tref->addRating(o,u,rateObject(o,u)); ++(objectLinks[o]); ++(userLinks[u]); //std::cout << "Added a link between object " << o << " and user " << u << std::endl; } for(objectID o=0;o!=objects;++o) { if(objectLinks[o]==0) { //std::cout << "Object " << o << " has no links!" << std::endl; ++unlinkedObjects; } } for(userID u=0;u!=users;++u) { if(userLinks[u]==0) { //std::cout << "User " << u << " has no links!" << std::endl; ++unlinkedUsers; } } if(verbose && ((unlinkedObjects==0) || (unlinkedUsers==0))) { std::cout << unlinkedObjects << " objects and " << unlinkedUsers << " users "; std::cout << "have no links!" << std::endl; } } else { for(userID u=0;u!=users;++u) for(objectID o=0;o!=objects;++o) tref->addRating(o,u,rateObject(o,u)); } } public: sim(unsigned long int _objects, unsigned long int _users, double sparsity, double beta, double epsilon, double convergence, double _qualityScale, double _abilityScale, double _ratingScale, gsl_rng *_r, bool _verbose) { verbose = _verbose; objects = _objects; users = _users; qualityScale = _qualityScale; abilityScale = _abilityScale; ratingScale = _ratingScale; objectQuality.resize(objects); userAbility.resize(users); r = _r; linksMax = ((double) objects)*((double) users); if(verbose) std::cout << "Maximum possible links: " << linksMax << std::endl; links = sparsity*linksMax; if(verbose) std::cout << "Actual number of links: " << links << std::endl; tref = new TREF(objects,users,links,beta,epsilon,convergence,verbose); } ~sim() { delete tref; } void simulation() { generateObjectQuality(); generateUserAbility(); generateRatings(); iter = tref->iterationCycle(); } double qualityDelta() { double Delta = 0; for(objectID o=0;o!=objects;++o) Delta += pow((tref->objectQuality(o)-objectQuality[o]),2); return sqrt(Delta/objects); } long unsigned int iterations() { return iter; } }; template <class TREF> class paretosim : public sim<TREF> { virtual double rateObject(objectID o, userID u) { double Cf = 2; if(this->ratingScale > 3) Cf = (this->ratingScale-2)*(this->ratingScale-3); double value = gsl_ran_pareto(this->r,this->ratingScale-1,sqrt(0.5*Cf*this->userAbility[u])); return (((gsl_rng_uniform(this->r)<0.5)?-value:value) + this->objectQuality[o]); } public: paretosim(unsigned long int _objects, unsigned long int _users, double sparsity, double beta, double epsilon, double convergence, double _qualityScale, double _abilityScale, double _ratingScale, gsl_rng *_r, bool _verbose) : sim<TREF> (_objects, _users, sparsity, beta, epsilon, convergence, _qualityScale, _abilityScale, _ratingScale, _r, _verbose) { } }; template <class TREF> class llsim : public sim<TREF> { protected: virtual double rateObject(objectID o, userID u) { double value; do { value = sim<TREF>::rateObject(o,u); } while (value<0 || value > this->qualityScale); return value; } public: llsim(unsigned long int _objects, unsigned long int _users, double sparsity, double beta, double epsilon, double convergence, double _qualityScale, double _abilityScale, double _ratingScale, gsl_rng *_r, bool _verbose) : sim<TREF> (_objects, _users, sparsity, beta, epsilon, convergence, _qualityScale, _abilityScale, _ratingScale, _r, _verbose) { } }; template <class TREF> class intsim : public sim<TREF> { protected: virtual double rateObject(objectID o, userID u) { double value = sim<TREF>::rateObject(o,u); value = floor(value+0.5); return value; } public: intsim(unsigned long int _objects, unsigned long int _users, double sparsity, double beta, double epsilon, double convergence, double _qualityScale, double _abilityScale, double _ratingScale, gsl_rng *_r, bool _verbose) : sim<TREF> (_objects, _users, sparsity, beta, epsilon, convergence, _qualityScale, _abilityScale, _ratingScale, _r, _verbose) { } }; template <template <class T> class TREFSIM, class TREF> class multisim { protected: std::vector<double> qualityDelta; std::vector<double> iterations; double mean(std::vector<double> &v) { return gsl_stats_mean(&(v[0]),1,v.size()); } double median(std::vector<double> &v) { std::vector<double> w(v.begin(),v.end()); gsl_sort(&(w[0]),1,w.size()); return gsl_stats_median_from_sorted_data(&(w[0]),1,w.size()); } double sd(std::vector<double> &v) { return gsl_stats_sd(&(v[0]),1,v.size()); } double min(std::vector<double> &v) { return gsl_stats_min(&(v[0]),1,v.size()); } double max(std::vector<double> &v) { return gsl_stats_max(&(v[0]),1,v.size()); } public: multisim(unsigned long int sims, unsigned long int objects, unsigned long int users, double sparsity, double beta, double epsilon, double convergence, double qualityScale, double abilityScale, double ratingScale, gsl_rng *r, bool verbose) { long unsigned int iter_total = 0; qualityDelta.resize(sims); iterations.resize(sims); for(unsigned long int s=0;s!=sims;++s) { TREFSIM<TREF> sim(objects,users, sparsity,beta,epsilon,convergence, qualityScale,abilityScale,ratingScale, r,verbose); sim.simulation(); qualityDelta[s] = sim.qualityDelta(); iter_total += iterations[s] = sim.iterations(); if(verbose) std::cout << "[" << s << "]\t" << qualityDelta[s] << "\t" << iterations[s] << std::endl; } if(verbose) std::cout << "Total iterations: " << iter_total << std::endl; } double qualityDeltaMean() { return mean(qualityDelta); } double qualityDeltaMedian() { return median(qualityDelta); } double qualityDeltaSD() { return sd(qualityDelta); } double qualityDeltaMin() { return min(qualityDelta); } double qualityDeltaMax() { return max(qualityDelta); } double iterationsMean() { return mean(iterations); } double iterationsMedian() { return median(iterations); } double iterationsSD() { return sd(iterations); } double iterationsMin() { return min(iterations); } double iterationsMax() { return max(iterations); } void stats() { std::cout << "\t" << qualityDeltaMean(); std::cout << "\t" << qualityDeltaMin(); std::cout << "\t" << qualityDeltaMax(); std::cout << "\t" << iterationsMean(); std::cout << "\t" << iterationsMin(); std::cout << "\t" << iterationsMax() << std::endl; } }; } #endif /* __TREFSIM_HPP__ */
#include <iostream> #include <gsl/gsl_rng.h> #include "tref.hpp" #include "trefsim.hpp" int main() { long unsigned int i, l; tref::smalltref sys(10,5,50,0.5,0,1e-12,true); unsigned int M = sys.objects(), N = sys.users(); gsl_rng *r = gsl_rng_alloc(gsl_rng_mt19937); for(i=0;i<N;++i) for(l=0;l<M;++l) sys.addRating(l,i,gsl_rng_uniform_pos(r)); std::cout << "Raw ratings: " << std::endl; sys.printRatings(); std::cout << std::endl; i = sys.iterationCycle(); std::cout << "Calculated optimum quality in " << i << " iterations..." << std::endl; for(l=0;l<M;++l) std::cout << "Object " << l << " weighted quality estimate: " << sys.objectQuality(l) << std::endl; std::cout << std::endl; gsl_rng_set(r,1001); tref::multisim<tref::sim,tref::smalltref> sim(/* sims */ 100, /*objects*/ 1000, /*users*/ 1000, /*sparsity*/ 1, /*beta*/ 0.8, /*epsilon*/ 1e-36, /*convergence*/ 1e-12, /*qualityScale*/ 10, /*abilityScale*/ 1, /*ratingScale*/ 1, /*rng*/ r, /*verbose*/ true); std::cout << "Delta = " << sim.qualityDeltaMean() << " +/- " << sim.qualityDeltaSD(); std::cout << " calculated in " << sim.iterationsMean() << " +/- " << sim.iterationsSD() << " iterations." << std::endl; gsl_rng_free(r); return 0; }