Well, Julia newbie here! I intend to implement a number of Bayesian 
hierarchical clustering models (more specifically, topic models) in Julia, 
and here is my implementation of Latent Dirichlet Allocation 
<https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation> as a gist:
https://gist.github.com/odinay/3e49d50ba580a9bff8e3


I should say my Julia implementation is almost 100 times faster than my 
Python (NumPy) implementation. For instance, for a simulated dataset drawn 
from 5 clusters, with 1000 groups each containing 100 points:


true_kk = 5
n_groups = 1000
n_group_j = 100 * ones(Int64, n_groups)
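
(For context, the groups are simulated along roughly these lines, using the 
parameters above. This is an illustrative simplification with made-up names 
such as vocab_size and draw, not the actual code in the gist; here every 
group is drawn from a single cluster's word distribution.)

# illustrative stand-in for the simulated data, not the gist's code
vocab_size = 25

# one normalized word distribution per cluster
topics = [rand(vocab_size) for k in 1:true_kk]
for k in 1:true_kk
    topics[k] = topics[k] / sum(topics[k])
end

# draw one index from a probability vector by inverse CDF
function draw(p)
    r = rand(); s = 0.0
    for i in 1:length(p)
        s += p[i]
        if r <= s
            return i
        end
    end
    return length(p)
end

docs = Vector{Int}[]
for j in 1:n_groups
    k = rand(1:true_kk)                       # cluster of group j
    push!(docs, [draw(topics[k]) for i in 1:n_group_j[j]])
end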


Julia spends nearly 0.1 sec on each LDA Gibbs sampling iteration, while it 
takes almost 9.5 sec in Python on my machine. But the code is still slow 
for real datasets. I know that Gibbs inference for these models is 
expensive by nature, but how can I make sure I have optimised the 
performance of my code as far as possible? For example, for a slightly 
bigger dataset such as


true_kk = 20
n_groups = 1000
n_group_j = 1000 * ones(Int64, n_groups)


the output is:


iteration: 98, number of components: 20, elapsed time: 3.209459973
iteration: 99, number of components: 20, elapsed time: 3.265090272
iteration: 100, number of components: 20, elapsed time: 3.204902689
elapsed time: 332.600401208 seconds (20800255280 bytes allocated, 12.87% gc time)
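
(The per-iteration lines are printed from my sampling loop and the last 
line is the summary from @time; schematically the measurement pattern is 
just the following, with dummy_sweep standing in for one real Gibbs pass.)

function dummy_sweep(n)            # placeholder for one Gibbs pass over the data
    s = 0.0
    for i in 1:n
        s += rand()
    end
    return s
end

# the @time summary line reports total time, bytes allocated and % gc time
@time for it in 1:100
    t = @elapsed dummy_sweep(10^7)
    println("iteration: $it, elapsed time: $t")
end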


As I move to more complex models, optimizing the code becomes an even 
bigger concern. How can I make sure, *without changing the algorithm* (I 
don't want to switch to other Bayesian approaches such as variational 
methods), that this is the best performance I can get? Parallelization is 
not the answer either: although efficient parallel Gibbs sampling for LDA 
has been proposed (e.g. here <http://lccc.eecs.berkeley.edu/Slides/Gonzalez10.pdf>), 
the same does not hold for more complex statistical models. So I want to 
know whether I am writing the loops and passing variables and types 
correctly, or whether it can be done more efficiently.
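
To make the question concrete, the kind of triple loop I have in mind looks 
roughly like the sketch below (illustrative names such as gibbs_sweep!, njk 
and nkw; not literally the code in the gist). Counts are updated in place 
and the per-component weights go into a caller-supplied buffer p, so ideally 
the loop body itself allocates nothing; the type annotations just document 
the concrete types that are passed in.

# njk[j, k]: points in group j assigned to component k
# nkw[k, w]: points with word id w assigned to component k
# nk[k]:     total points assigned to component k
# z[j][i]:   current assignment of point i in group j
function gibbs_sweep!(z::Vector{Vector{Int}}, docs::Vector{Vector{Int}},
                      njk::Matrix{Int}, nkw::Matrix{Int}, nk::Vector{Int},
                      alpha::Float64, beta::Float64, p::Vector{Float64})
    kk, vv = size(nkw)
    for j in 1:length(docs)                    # over groups
        doc = docs[j]
        zj = z[j]
        for i in 1:length(doc)                 # over points in group j
            w = doc[i]
            kold = zj[i]
            # remove the current assignment from the sufficient statistics
            njk[j, kold] -= 1; nkw[kold, w] -= 1; nk[kold] -= 1
            # unnormalized conditional p(z_ji = k | everything else)
            s = 0.0
            for k in 1:kk                      # over components
                p[k] = (njk[j, k] + alpha) * (nkw[k, w] + beta) / (nk[k] + vv * beta)
                s += p[k]
            end
            # inverse-CDF draw from p without building any temporary arrays
            r = rand() * s
            knew = 1
            acc = p[1]
            while acc < r && knew < kk
                knew += 1
                acc += p[knew]
            end
            # record the new assignment and restore the counts
            zj[i] = knew
            njk[j, knew] += 1; nkw[knew, w] += 1; nk[knew] += 1
        end
    end
    return nothing
end

Is this roughly the shape a performant Gibbs loop should have in Julia, or 
is there a better way to organise the loops and pass the state around?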


What made me unsure of my work is the huge amount of memory that is 
allocated, almost 20 GB. I am aware that, since numbers are immutable 
types, Julia has to copy them for manipulation and calculation. But 
considering the complexity of my problem (3 nested loops) and the size of 
my data, maybe based on your experience you can tell me whether allocating 
around 20 GB is normal or whether I am doing something wrong?
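
A quick back-of-the-envelope check: 20800255280 bytes over 100 iterations x 
1000 groups x 1000 points is roughly 200 bytes per sampled point, which 
looks like a couple of small temporary vectors per point. So I wonder 
whether a per-point pattern like the following (again illustrative, not 
literally the gist) is the culprit, as opposed to reusing an in-place 
weight buffer as in the sweep sketch above:

# allocating variant of the per-point draw: the elementwise arithmetic and
# cumsum each build a brand-new vector on every call, so every sampled point
# leaves a few hundred bytes of garbage behind
function sample_token_alloc(njk_row, nkw_col, nk, alpha, beta, vv)
    p = (njk_row .+ alpha) .* (nkw_col .+ beta) ./ (nk .+ vv * beta)   # new vector
    c = cumsum(p)                                                      # another new vector
    return searchsortedfirst(c, rand() * c[end])
end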


Best, 

Adham


julia> versioninfo()
Julia Version 0.3.11
Commit 483dbf5* (2015-07-27 06:18 UTC)
Platform Info:
  System: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
