I am training my data using the following code:


    start_time := clock_timestamp();
    PERFORM madlib.create_nb_prepared_data_tables(
        'nb_training',
        'class',
        'attributes',
        'ARRAY[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57]',
        57,
        'categ_feature_probs',
        'numeric_attr_params',
        'class_priors'
    );
    training_time := 1000 * (extract(epoch FROM clock_timestamp())
                           - extract(epoch FROM start_time));

And my prediction code is as follows:

    start_time := clock_timestamp();
    PERFORM madlib.create_nb_probs_view(
        'categ_feature_probs',
        'class_priors',
        'nb_testing',
        'id',
        'attributes',
        57,
        'numeric_attr_params',
        'probs_view'
    );
    -- read the view's rows; a bare SELECT has no destination inside PL/pgSQL,
    -- hence PERFORM
    PERFORM * FROM probs_view;
    prediction_time := 1000 * (extract(epoch FROM clock_timestamp())
                             - extract(epoch FROM start_time));
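
Both snippets are fragments of a PL/pgSQL block; a minimal skeleton showing how the timing variables are declared and reported could look like the following (pg_sleep is only a stand-in for the MADlib calls so that the skeleton runs on its own, and the variable types shown are just one workable choice):

    DO $$
    DECLARE
        start_time    timestamptz;
        training_time double precision;
    BEGIN
        start_time := clock_timestamp();
        -- the madlib.create_nb_prepared_data_tables(...) call from above goes here;
        -- pg_sleep(1) is just a placeholder so this block is runnable by itself
        PERFORM pg_sleep(1);
        training_time := 1000 * (extract(epoch FROM clock_timestamp())
                               - extract(epoch FROM start_time));
        RAISE NOTICE 'training_time = % ms', training_time;
    END;
    $$;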

The training data contains 450,000 records, whereas the testing dataset contains 50,000 records.

Still, my average training_time is around 17173 ms, whereas prediction_time is 26481 ms. As per my understanding of Naive Bayes, prediction_time should be less than training_time. What am I doing wrong here?
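
For context, per record those timings work out to roughly 0.04 ms per training row versus about 0.5 ms per test row:

    SELECT 17173.0 / 450000 AS training_ms_per_row,
           26481.0 / 50000  AS prediction_ms_per_row;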
