Your types should generally not have fields of type Function; they should
only contain object state. Sticking function objects into types so that you
can do Python-style OO programming in Julia is an antipattern and will
result in terrible performance. Instead, you should define methods for
external generic functions that dispatch on your types. If you have a
foo::Foo object, you should be calling f(foo), not foo.f().
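A minimal sketch of the dispatch style described above; `Foo` and `describe` are illustrative names, not from the thread:

```julia
# Idiomatic Julia: dispatch on the type, don't store functions in fields.
struct Foo            # `type Foo` in the 2015-era Julia of this thread
    x::Float64
end

# Antipattern (Python-style): a field like f::Function holding a closure.
# Idiomatic: a method of an external generic function, dispatching on Foo.
describe(foo::Foo) = "Foo with x = $(foo.x)"

foo = Foo(1.5)
describe(foo)    # f(foo), not foo.f()
```

Because `describe(::Foo)` is an ordinary method, the compiler can specialize it on the concrete type, which is exactly what a `Function`-typed field prevents.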


On Fri, Mar 27, 2015 at 6:52 PM, Michael Bullman <[email protected]>
wrote:

> Hi Stefan,
>
> I'm attaching the code I'm running. I'll type out a description of how the
> code runs.
>
> The code is supposed to simulate two different load balancing algorithms
> in a system between a load balancer and a server cluster.
>
> I define two user types: Fyle and Node.
> Fyle has three fields: Size::Float64, OrigSize::Float64, and
> lifetime::Int.
> Initially it had functions to retrieve these values, but those have been
> commented out; I replaced them with direct field access, e.g. fyle.size.
>
> Node is meant to simulate a server in the cluster; its field and function
> definitions are a bit more in-depth.
> Node has five fields:
> threads: an Array which stores Fyles; each entry in the Array is
> supposed to simulate a process thread.
> maxThreads::Int gives the length of the threads Array; I've been thinking
> of using it to actually limit the Array size, but haven't implemented
> that yet.
> complete: also an Array of Fyles; after a file is completed it is moved
> to the complete Array and removed from the threads Array.
> percent::Float64 is calculated as length(threads)/maxThreads in a
> function.
>
> add_file push!()'s a Fyle into the threads Array.
> pop_file calls pop!() on threads.
> process_threads subtracts 0.25 from each Fyle in the threads Array; when
> a Fyle's size drops below 0 it is removed from the threads Array and
> pushed into the complete Array.
> thread_utilization computes percent and returns it.
> to_string actually does nothing; it's a carryover from the Python code
> that I kept.
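The Fyle/Node design described above could be sketched in the dispatch style Stefan recommends; field and function names follow the email, but the details (constructors, exact removal logic) are guesses, not the attached code:

```julia
mutable struct Fyle
    size::Float64
    origSize::Float64
    lifetime::Int
end

mutable struct Node
    threads::Vector{Fyle}
    maxThreads::Int
    complete::Vector{Fyle}
end
Node(maxThreads::Int) = Node(Fyle[], maxThreads, Fyle[])

# Methods of external generic functions, dispatching on Node:
add_file!(n::Node, f::Fyle) = push!(n.threads, f)
pop_file!(n::Node) = pop!(n.threads)

# Subtract 0.25 from every file; move files that drop below 0 to `complete`.
function process_threads!(n::Node)
    remaining = Fyle[]
    for f in n.threads
        f.size -= 0.25
        f.size < 0 ? push!(n.complete, f) : push!(remaining, f)
    end
    n.threads = remaining
    return n
end

thread_utilization(n::Node) = length(n.threads) / n.maxThreads
```

With concretely typed fields (`Vector{Fyle}`, `Int`) and no `Function`-typed fields, the compiler can generate specialized code for each of these methods.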
>
> so those are the two types which are the basis of the simulation.
>
> The simulation is initiated in main(seed, time_sim); in the current
> implementation the seed is used to call srand(seed) before generating the
> file inputs, for reproducibility.
> For each time step t, a number of files comes into the system. This is
> based on a fit of data plus a randomly generated noise term. The
> generated Fyle input is an Array of Fyle Arrays, so each entry can hold
> a different number of files to be processed. In the original Python
> implementation this was done in the actual simulation loop: at each step
> a new list of files would be generated. But to try to speed up the Julia
> code I moved this generation to before the simulation, and I use the
> same Fyle input twice to test each algorithm rather than generating the
> same input twice.
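The pre-generation idea above can be sketched as follows; `generate_input` and the size/count distributions are made up for illustration, and note that the thread's 2015-era `srand(seed)` is `Random.seed!(seed)` in modern Julia:

```julia
using Random

# Pre-generate one batch of file sizes per time step, seeded for
# reproducibility, so the same input can be fed to both algorithms.
function generate_input(seed::Int, nsteps::Int)
    Random.seed!(seed)              # srand(seed) in the 2015-era API
    # each entry holds a different number of files (counts/sizes are made up)
    return [rand(rand(1:5)) for _ in 1:nsteps]
end

a = generate_input(42, 10)
b = generate_input(42, 10)   # same seed, hence identical input
```

Generating the input once outside the loop also keeps the comparison between the two algorithms fair: both see byte-identical workloads.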
>
> I also define a bunch of path names here for output files so I can
> verify and analyze my results; these are defined in main() but passed to
> run(). Currently two are not actually used, to cut down on writes per
> time step.
> At each time step the Fyles are distributed to the nodes in the cluster
> Array based on either round robin or a least-thread-utilization
> algorithm. Then all the nodes process their threads at the end of each
> time step.
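The two distribution policies mentioned above could be sketched like this; the helper names are mine, not from the attached code:

```julia
# Round robin: cycle through node indices 1..n.
next_round_robin(i::Int, n::Int) = i % n + 1

# Least utilization: pick the node whose threads/maxThreads ratio is
# lowest (argmin was indmin in the 2015-era Julia of this thread).
least_utilized(utils::Vector{Float64}) = argmin(utils)
```

Round robin is O(1) per file, while least-utilization costs a scan over the cluster per file, which is worth remembering when profiling the two algorithms.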
>
> Every 60 time steps results are averaged, data is written to a .csv
> file, and data structures are reinitialized to begin a fresh 60-time-step
> measurement.
>
> I think that mostly covers the code at a high level. The only other
> thing I can think of is that every time a Fyle is generated in the
> initialization phase, an inverse transform sample is performed to
> randomly generate Fyle sizes. This involves a binary search through an
> Array; in the original Python implementation this was a linear search.
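Inverse transform sampling over a tabulated CDF with a binary search, as described above, might look like this (a sketch under assumed data layout, not the attached code):

```julia
# `cdf` is a sorted array of cumulative probabilities; `values` holds the
# corresponding sample values. For a uniform draw u in [0, 1), find the
# first CDF entry >= u via binary search, O(log n) instead of O(n).
function inverse_transform_sample(cdf::Vector{Float64},
                                  values::Vector{Float64}, u::Float64)
    i = searchsortedfirst(cdf, u)     # built-in binary search
    return values[min(i, length(values))]
end
```

`searchsortedfirst` is the standard-library binary search, so this is the natural Julia replacement for the Python linear scan.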
>
> I've really tried making minor tweaks to the two versions to speed them
> up, but so far no luck.
>
> Thanks for any help
>
> On Wednesday, March 25, 2015 at 3:32:29 AM UTC-4, Stefan Karpinski wrote:
>>
>> Yes, writing to a file is one of the slower things you can do. So if
>> that's in a performance-critical loop it will very much slow things down.
>> But that would be true for Python and PyPy as well. Are you doing the same
>> thing in that code?
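One common alternative to per-time-step writes, along the lines Michael asks about below (an assumption, not something prescribed in the thread): buffer results in memory and write the CSV once at the end.

```julia
path = tempname()                    # hypothetical output path
rows = String[]
for t in 1:100
    push!(rows, "$(t),$(t * 0.5)")   # collect one row per time step
end
open(path, "w") do io                # single write at the end of the run
    for r in rows
        println(io, r)
    end
end
```

For very large outputs, writing in chunks (say, every 60 steps, matching the averaging window) bounds memory while still keeping file I/O out of the innermost loop.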
>>
>>
>> > On Mar 25, 2015, at 4:00 AM, Michael Bullman <[email protected]>
>> wrote:
>> >
>> > Hi Guys,
>> >
>> > So I just went back through my code. I didn't see any global
>> variables. I'm going to start using the @time macro tomorrow to try to
>> identify the worst functions. Would writes to file significantly impact
>> speed? I know from looking on Google that writing to files is frowned
>> upon, but what is a better alternative? Hold everything in an Array
>> until the program finishes, then write it all out at the end? Are
>> databases a viable option when output is very large, or when records
>> need to be kept?
>> >
>> > I'm also going over the code again and might post a copy if people are
>> interested, but I'm not going to be doing that tonight.
>> >
>> > Thanks again
>>
>
